Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 ›...

54
Natural Language Dialogue - Future Way of Accessing Information Hang Li Noah’s Ark Lab Huawei Technologies CCIR 2015 Luoyang Aug. 24, 2015

Transcript of Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 ›...

Page 1: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Natural Language Dialogue - Future Way of Accessing Information

Hang Li

Noah’s Ark Lab

Huawei Technologies

CCIR 2015 Luoyang

Aug. 24, 2015

Page 2: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Scientific development depends in part on a process of non-incremental or revolutionary change.

Thomas Kuhn

Page 3: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Development of Information Retrieval

Library Search

Web Search

Natural Language Dialogue ?

1970

1990

2010

Page 4: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Outline

• Information Access through Natural Language Dialogue

• Data-Driven Approach

• Single Turn Dialogue

• Sentence Representation Learning

• Our Approach Using Deep Learning

• Summary

Page 5: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Natural Language Dialogue

Natural language dialogue: most natural way for human machine communication

Information Input: • visual 83% • auditory 11% • others 6%

Information Output: • speech 70-80%

Page 6: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Search: Restricted Single Turn Dialogue

• Query: keywords • Document: bag of words • Query document matching • Ranking of documents • Evaluation: relevance • Learning to rank and learning to match

Page 7: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Question Answering: Simplified Single Turn Dialogue

• Question: question representation • Answer: answer representation • Question answer matching • Evaluation: accuracy • Learning to match

Page 8: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Task Completion through Natural Language Dialogue

• Multi-turn dialogue • Goal: task completion, mostly information access • Evaluation: completion rate / cost • Including traditional search and question answering as special cases

……

Page 9: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Example One: Hotel Booking on Smartphone

P: How may I help you? U: I'd like to book a hotel room for tomorrow. P: For how many people? U: Just me. What is the total cost? P: That would be $120 per night. U: No problem. Book the room for one night, please.

Page 10: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Example Two: Auto Call Center

• U: hello • H: hello, how can I help you? • U: can you tell me how to find

ABC software? • H: please go to this URL to

download • U: how to activate the software? • H: please see this document

Page 11: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Example Three: Trouble Shooting at Telecommunication Network

R: What is your problem? E: Dropped call in area .. R: My diagnosis is as follows Symptom: ….. Likely cause: …. Solution: ….

Page 12: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Related Work

• Academic research

– Rule-based approach, e.g., Eliza System, Weizenbaum 1964

– Reinforcement learning based approach, e.g., Singh et al 1999

– Machine translation based approach, e.g., Ritter et al 2011

• Commercial products

– Apple Siri

– Google Now

– MS Cortana

Page 13: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Outline

• Information Access through Natural Language Dialogue

• Data-Driven Approach

• Single Turn Dialogue

• Sentence Representation Learning

• Our Approach Using Deep Learning

• Summary

Page 14: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Data-Driven Approach to Dialogue

Yes, I will Will you attend

CCIR 2015?

Learning how to converse mainly from data

Page 15: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Data-Driven Approach

• Key Points – Mainly learning from data

– Avoid difficult language understanding problem as much as possible

– Simple fundamental mechanism

• Issues to Investigate – Single-turn dialogue

– Multi -turn dialogue

– Knowledge utilization and reasoning (lightweight)

– Dialogue management (lightweight)

Page 16: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

As Scientific Research Problem

• Not as engineering hacks

• Scientific methodology

– Mathematical modeling

– Positivism

– Reductionism

Page 17: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Outline

• Information Access through Natural Language Dialogue

• Data-Driven Approach

• Single Turn Dialogue

• Sentence Representation Learning

• Our Approach Using Deep Learning

• Summary

Page 18: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

To divide each of the difficulties under examination into as many parts as possible, and as might be necessary for its adequate solution

- Ren´e Descartes

Page 19: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Solving The Problem of Single Turn Dialogue Can Partially Solve the Problem

• U: hello • H: hello, how can I help you? • U: can you tell me how to

find ABC software? • H: please go to this URL to

download • U: how to activate the

software? • H: please see this document

hello

hello, how can I help you

can you tell me how to find ABC software?

please go to this URL to download

how to activate the software?

please see this document

……

Breaking down multi-turn dialogues to single-turn dialogues

Page 20: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Retrieval-based Single Turn Dialogue

Retrieval System

Repository of Message Response Pairs

Learning System

Page 21: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Generation-based Single Turn Dialogue

Generation System

Learning System

Page 22: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

• Retrieval-based approach to single turn dialogue

• 5 million Chinese Weibo data, 1 million Japanese Twitter data

• Registration deadline: Oct 31, 2015

Page 23: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Outline

• Information Access through Natural Language Dialogue

• Data-Driven Approach

• Single Turn Dialogue

• Sentence Representation Learning

• Our Approach Using Deep Learning

Page 24: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Distributional Linguistics

• Distributional semantics

• You shall know a word by the company it keeps - John Firth

• Sylvan?

• Companying words: woods, forest, tree, living, woodland, beautiful, grove

Page 25: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Representation of Word Meaning

cat

kitten

dog

puppy

Using high-dimensional real-valued vectors to represent the meaning of words

Page 26: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Representation of Sentence Meaning

John loves Mary

Mary is loved by John

Mary loves John

Using high-dimensional real-valued vectors to represent the meaning of sentences

New finding: This is possible

Page 27: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Recent Breakthrough in Distributional Linguistics

• From words to sentences

• Compositional

• Representing syntax, semantics, even pragmatics

Page 28: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

How Is Learning of Sentence Meaning Possible?

• Deep neural networks (complicated non-linear models)

• Big Data

• Task-oriented

• Error-driven and gradient-based

Page 29: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Natural Language Tasks

• Classification: assigning a label to a string

• Generation: creating a string

• Matching: matching two strings

• Translation: transforming one string to another

• Structured prediction: mapping string to structure

ts

Rts,

cs

s

'ss

Page 30: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Natural Language Applications Can Be Formalized as Tasks

• Classification

– Sentiment analysis

• Generation

– Language modeling

• Matching

– Search

– Question answering

• Translation

– Machine translation

– Natural language dialogue (single turn)

– Text summarization

– Paraphrasing

• Structured Prediction

– Information Extraction

– Parsing

Page 31: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Learning of Representations in Tasks

• Classification

• Generation

• Matching

• Translation

• Structured Prediction

trs

Rrts,

crs

)(rs

rss '

Page 32: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Deep Learning Tools for Learning Sentence Representations

• Neural Word Embedding

• Recurrent Neural Networks

• Recursive Neural Networks

• Convolutional Neural Networks

Page 33: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Word Representation: Neural Word Embedding

)()(

),(log

cPwP

cwP

10 1 5

2 1

2.5 1

1w

2w

3w

1c 2c 3c4c 5c

7 1 1

2 3

1 1.5 2

1w

2w

3w

1t 2t 3t

TWCM matrix factorization

word embedding or word2vec

W

M

Page 34: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Recurrent Neural Network (RNN) (Mikolov et al. 2010)

the cat sat …. mat

• Variable length • Long dependency: long short term memory (LSTM)

the cat sat on the mat

),( 1 ttt whfh

Page 35: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Recursive Neural Network (RNN) (Socher et al. 2011)

the cat sat on the mat

• Based on parse tree • Auto-encoding and decoding

the cat sat on the mat

Page 36: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Convolutional Neural Network (CNN) (Hu et al. 2014)

• Robust parsing • Shared parameter • Fixed length, zero padding

the cat sat on the mat

……

Page 37: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Outline

• Information Access through Natural Language Dialogue

• Data-Driven Approach

• Single Turn Dialogue

• Sentence Representation Learning

• Our Approach Using Deep Learning

• Summary

Page 38: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Collaborators

38

Zhengdong Lu Baotian Hu Mingxuan Wang

Qun Liu Qingcai Chen Lifeng Shang

Page 39: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Our Approach to Single-Turn Dialogue Using Deep Learning

• Retrieval-based

– Deep Match CNN (Hu et al., NIPS 2015)

– Deep Match Tree (Wang et al., IJCAI 2015)

• Generation-based

– Neural Responding Machine (Shang et al., ACL 2015)

Page 40: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Natural Language Dialogue System - Retrieval based Approach

index of messages and responses

matching

ranking

message

retrieval

retrieved messages and responses

ranked responses

matching models

ranking model

online

offline

best response

matched responses

Page 41: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Deep Match CNN - Archtecture I

• First represent two sentences, and then match

41

Page 42: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Deep Match CNN - Architecture II

• Represent and match two sentences simultaneously

• Two dimensional convolution and pooling

42

Page 43: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Deep Match Tree • Constructing deep neural network, with first

layer corresponding mined patterns

43

Page 44: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Retrieval based Approach: Accuracy = 70%+

上海今天好熱,堪比新加坡。 上海今天热的不一般。 想去武当山 有想同游的么? 我想跟帅哥同游~哈哈

It is very hot in Shanghai today, just like Singapore . It is unusually hot. I want to go to Mountain Wudang, it there anybody going together with me? Haha, I want to go with you, handsome boy

Page 45: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Natural Language Dialogue System - Generation based Approach

• Encoding messages to intermediate representations

• Decoding intermediate representations to responses

• Recurrent Neural Network (RNN)

Message

Response

Encoder

Txxx 21x

Decoder

tyyy 21y

c

h

Generator

Page 46: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Neural Responding Machine

用了

都说

???

华为 手机 怎么 样

Attention Signal

Co

ntext G

en

erato

r

Global Encoder

Local Encoder

Decoder

140 million parameters

Page 47: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Generation based Approach Accuracy = 70%+

占中终于结束了。 下一个是陆家嘴吧? 我想买三星手机。 还是支持一下国产的吧。

Occupy Central is finally over. Will Lujiazui (finance district in Shanghai) be the next? I want to buy a Samsung phone Why not buy our national brands?

Page 48: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Demo: Neural Responding Machine

Page 49: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Advantages and Disadvantages

• Advantages

–Powerful modeling capability

–No human intervention

• Disadvantages

–Black box

Page 50: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Open Questions and Future Work

• How to incorporate knowledge

• How to combine with logic

• How to extend to multi-turn dialogue

Page 51: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Outline

• Information Access through Natural Language Dialogue

• Data-Driven Approach

• Single Turn Dialogue

• Sentence Representation Learning

• Our Approach Using Deep Learning

• Summary

Page 52: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Take-away Messages

• Natural language dialogue = new paradigm for information retrieval

• Data-driven approach is key

• Current focus is single turn dialogue

• NTCIR STC task: call for participations

• Sentence representation learning is possible

• Progress being made on both retrieval-based and generation-based single-turn dialogue with deep learning and big data

Page 53: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

References • Mikolov, Tomas, Martin Karafiát, Lukas Burget, Jan Cernocký, and Sanjeev

Khudanpur. Recurrent Neural Network based Language Model. INTERSPEECH 2010, 2010.

• Socher, Richard, Cliff C. Lin, Chris Manning, and Andrew Y. Ng. Parsing natural scenes and natural language with recursive neural networks. ICML-11, pp. 129-136. 2011.

• Baotian Hu, Zhengdong Lu, Hang Li, Qingcai Chen. Convolutional Neural Network Architectures for Matching Natural Language Sentences. NIPS'14, 2042-2050, 2014.

• Mingxuan Wang, Zhengdong Lu, Hang Li, Qun Liu. Syntax-based Deep Matching of Short Texts. Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI’15), 2015.

• Lifeng Shang, Zhengdong Lu, Hang Li. Neural Responding Machine for Short Text Conversation. ACL-IJCNLP'15, 2015.

Page 54: Natural Language Dialogue - Future Way of Accessing ... › uploads › 3 › 1 › 6 › 8 › 3168008 › ccir_2015_hangli.pdfNatural Language Dialogue - Future Way of Accessing

Thank you!