Email Processing and Question Answering System (EPQAS)
Date: 29 March 2018
Rafael E. Banchs Human Language Technology Unit
HOW IT STARTED
• The agency receives a high volume of emails that need to be dealt with on a daily basis, demanding a significant amount of resources and resulting in long response times
• Main objectives: to use state-of-the-art natural language processing technologies to
1. Reduce the volume of incoming emails by supporting advanced FAQ online support at the agency's website
2. Automatically redirect the incoming emails to the appropriate officer or group
3. Provide officers with pre-selected responses based on similar past email responses
Problem Statement and Objectives
[Diagram] USER sends email → Agency website: Automated FAQ system → Query resolved? YES: done; NO: Email classifier → Group 1 … Group N → Officer 1 … Officer N, each with proposed responses
Components: Frequently Asked Questions (FAQ) Service Engine; Email Classification and Response Recommendation Engine
Proposed Solution
WHAT WE FACED
Phonetics: system of sounds
Morphology: forms and words
Syntax: clauses and sentences
Semantics: conveying of meaning
Pragmatics: meaning in context
[Diagram labels: the lower levels relate to communication means and physiological capabilities; the higher levels to intentions, pursued goals, and abstract cognitive faculties]
Levels of Linguistic Phenomena
• The process of transforming a natural language statement into a semantic representation (frame):
• Subtask 1: Intent Detection
• Subtask 2: Entity Extraction
Example: "Ok, I will meet you in Starbucks at 6pm"
Intent Detection: Confirm_meeting
Entity Extraction: Action: Meet | Place: Starbucks | Time: 6:00pm
Natural Language Understanding
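The two NLU subtasks above can be sketched as follows. This is a minimal, illustrative rule-based frame filler (the `SemanticFrame` class, `understand` function, and regex patterns are invented for this sketch, not the system's actual implementation):

```python
# Minimal sketch of NLU as frame filling, assuming toy regex rules
# for one intent (Confirm_meeting) and a few entity types.
import re
from dataclasses import dataclass, field

@dataclass
class SemanticFrame:
    intent: str
    entities: dict = field(default_factory=dict)

def understand(utterance: str) -> SemanticFrame:
    """Map an utterance to a semantic frame with toy patterns."""
    frame = SemanticFrame(intent="Unknown")
    # Subtask 1: Intent Detection
    if re.search(r"\b(ok|sure).*\bmeet\b", utterance, re.I):
        frame.intent = "Confirm_meeting"
        frame.entities["Action"] = "Meet"
    # Subtask 2: Entity Extraction
    place = re.search(r"\bin (\w+)", utterance)
    if place:
        frame.entities["Place"] = place.group(1)
    time = re.search(r"\bat (\d+(?::\d+)?\s*[ap]m)", utterance, re.I)
    if time:
        frame.entities["Time"] = time.group(1)
    return frame

frame = understand("Ok, I will meet you in Starbucks at 6pm")
print(frame.intent, frame.entities)
# Confirm_meeting {'Action': 'Meet', 'Place': 'Starbucks', 'Time': '6pm'}
```

Real systems replace the regexes with statistical classifiers and sequence taggers; the frame structure stays the same.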
• Transactional natural language applications
Detected Intent: Confirm_meeting
Extracted Entities: Action: Meet | Place: Starbucks | Time: 6:00pm → Semantic Frame
Execute Transaction, for instance… Update_calendar
• Informational natural language applications
Detected Intent: Ask_for_location
Extracted Entities: Action: Go_to | Place: Starbucks | Location: ??? → Semantic Frame
Search for Information, for instance… Find_address_in_map
Transactional vs Informational
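The transactional/informational split amounts to dispatching on the detected intent. A hedged sketch (the `dispatch` function, intent names, and handler labels are illustrative, not part of the deployed system):

```python
# Sketch: route a semantic frame either to a transaction executor or
# to an information search, depending on its intent. The intent-to-
# handler tables are toy examples.
def dispatch(frame: dict) -> str:
    transactional = {"Confirm_meeting": "Update_calendar"}
    informational = {"Ask_for_location": "Find_address_in_map"}
    intent = frame["intent"]
    if intent in transactional:
        return f"execute:{transactional[intent]}"   # run a transaction
    if intent in informational:
        return f"search:{informational[intent]}"    # look up information
    return "fallback"

print(dispatch({"intent": "Confirm_meeting"}))   # execute:Update_calendar
print(dispatch({"intent": "Ask_for_location"}))  # search:Find_address_in_map
```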
• Q&A is typically an informational application
• There are two different approaches, depending on the type of information available:
• Question search: matching intents and entities over a database of available question-answer pairs (FAQs).
• Response selection: matching intents and entities over a collection of statements that might contain the answer.
Question Answering
NLU in Question Answering
• Intents have to be identified among different language constructions:
Do you take credit cards?
Can I make the payment with visa?
Detected Intent: Ask_about_CC_accepted
• Entities have to be identified among different references:
Extracted Entities: Action: Payment | Type: Credit_Card | Accept: ???
Problems of Discrete Representation
Consider the following sequences of words:
1. Do you take credit cards?
2. Can I make the payment with visa?
3. When can I make the payment for tourist visa application?
Sentences 1 and 2 are SEMANTICALLY RELATED, yet their WORD OVERLAP = 0%
Sentences 2 and 3 are NOT SEMANTICALLY RELATED, yet their WORD OVERLAP = 70%
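The mismatch between surface overlap and meaning is easy to verify. A small sketch using Jaccard overlap on word sets (one of several ways to score overlap; the exact percentages depend on tokenization, so the figures on the slide need not match this toy measure exactly):

```python
# Sketch of the discrete-representation problem: bag-of-words overlap
# disagrees with semantic relatedness on the slide's three sentences.
def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two sentences."""
    wa = set(a.lower().strip("?").split())
    wb = set(b.lower().strip("?").split())
    return len(wa & wb) / len(wa | wb)

s1 = "Do you take credit cards?"
s2 = "Can I make the payment with visa?"
s3 = "When can I make the payment for tourist visa application?"

print(word_overlap(s1, s2))  # related pair, yet zero overlap
print(word_overlap(s2, s3))  # unrelated pair, yet high overlap
```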
Properties of Continuous Spaces
The Distributional Hypothesis
"a word is characterized by the company it keeps" (Firth 1957)
Meaning is mainly determined by context, rather than by individual language units
• Continuous spaces represent semantic similarities by means of the geometric concept of proximity
• They offer much "better" smoothing capabilities
• They are not constrained to the Markovian assumption
Similarity in Continuous Space
Can I make the payment with visa?
Do you take credit cards?
When can I make the payment for tourist visa application?
Similarity in Continuous Space
Can I make the payment with visa?
SEMANTICALLY RELATED
(SHORT DISTANCE) Do you take credit cards?
When can I make the payment for tourist visa application?
Similarity in Continuous Space
Can I make the payment with visa?
SEMANTICALLY RELATED
(SHORT DISTANCE)
NOT SEMANTICALLY RELATED
(LONG DISTANCE)
Do you take credit cards?
When can I make the payment for tourist visa application?
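Proximity in a continuous space is typically measured with cosine similarity. A toy sketch with invented 2-D sentence embeddings, chosen only to illustrate short versus long distance (real embeddings have hundreds of dimensions and are learned, not hand-picked):

```python
# Sketch: cosine similarity as the proximity measure in a continuous
# space. The 2-D vectors below are invented for illustration.
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

credit_q  = (0.9, 0.1)   # "Do you take credit cards?"
payment_q = (0.8, 0.2)   # "Can I make the payment with visa?"
visa_q    = (0.1, 0.9)   # "...tourist visa application?"

print(cosine(credit_q, payment_q))  # high similarity: short distance
print(cosine(payment_q, visa_q))    # low similarity: long distance
```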
Building Continuous Space Models
1. Train a deep learning network on a supervised task
2. Use some of its internal layer representations as continuous space models for language
[Diagram: an input sentence (W1, W2, W3, …, Wn) feeds through increasing levels of abstraction to the supervised-task outputs (C1, C2, …, Ck); an internal layer is kept as the MODEL]
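The two-step recipe above can be sketched in a few lines of numpy. The weights here are random stand-ins for what training on the supervised task would produce; the point is only that the hidden layer, not the task output, is what gets reused as the continuous space model:

```python
# Sketch of the two steps: a tiny network stands in for a trained
# deep model; its hidden layer is reused as a sentence embedding.
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(5, 3))    # input bag-of-words -> hidden layer
W_out = rng.normal(size=(3, 2))   # hidden layer -> supervised classes

def hidden_representation(bow: np.ndarray) -> np.ndarray:
    """Step 2: keep only the internal layer as the embedding."""
    return np.tanh(bow @ W_in)

def classify(bow: np.ndarray) -> np.ndarray:
    """Step 1: the full supervised pipeline used during training."""
    return hidden_representation(bow) @ W_out

sentence = np.array([1.0, 0.0, 1.0, 0.0, 1.0])  # toy 5-word vocabulary
embedding = hidden_representation(sentence)
print(embedding.shape)  # a 3-dimensional continuous representation
```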
Semantic Maps of Words (Banchs 2012)
[Map of word embeddings: BIRD, GOAT, SKY, LIGHTNING, THUNDER, RAIN, FIELD, FLOCK, SHEEP, MOUNTAIN, SEA, CLOUD, WIND, FISH, RIVER, STORM. The map separates living things from non-living things, and further clusters words by region: water, land, and sky]
Regularities as Vector Offsets (Mikolov 2013)
[Diagram: King, Kings, Queen, Queens, connected by a gender offset and a singular/plural offset]
Kings – King ≈ Queens – Queen
Queens ≈ Kings – King + Queen
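The offset relation above can be checked numerically. A toy sketch with invented 2-D vectors constructed so that the gender and number offsets are consistent (real word embeddings only satisfy such analogies approximately):

```python
# Toy verification of the vector-offset regularity, with invented
# 2-D vectors chosen so the offsets hold exactly.
import numpy as np

king   = np.array([1.0, 1.0])
kings  = np.array([1.0, 2.0])   # king + plural offset (0, 1)
queen  = np.array([3.0, 1.0])   # king + gender offset (2, 0)

# Queens ≈ Kings – King + Queen
queens = kings - king + queen
print(queens)  # lands at queen + plural offset
```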
Regularities across Languages (Hermann 2014)
[Diagram: days of the week and months of the year form parallel clusters across German, French, and English embeddings]
THE SOLUTION
[Same architecture diagram as in the Proposed Solution: USER sends email → Automated FAQ system on the agency website → Query resolved? YES: done; NO: Email classifier → Groups and Officers with proposed responses]
The Frequently Asked Questions (FAQ) Service Engine addresses the QUESTION SEARCH PROBLEM; the Email Classification and Response Recommendation Engine addresses the RESPONSE SELECTION PROBLEM
Proposed Solution Revisited
Overall FAQ System Architecture
[Diagram] User query → Query type?
• Keyword-based query → keyword-based search over the FAQ database
• Natural language query → query processing → Continuous Space Model 1, Continuous Space Model 2, and in-domain bag-of-words model → re-ranker
An in/out-domain classifier and the FAQ database support the pipeline.
Control Engine:
• High confidence: single response
• Medium confidence: multiple responses
• Low confidence: suggest to send email
• Out-of-domain: institutional response
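The control engine's confidence policy can be sketched as a threshold cascade. The function name, thresholds, and return labels below are illustrative placeholders, not the deployed values:

```python
# Sketch of the control engine's four-way policy. Thresholds (0.8,
# 0.5) and labels are invented for illustration.
def control_engine(matches, in_domain: bool,
                   hi: float = 0.8, mid: float = 0.5):
    """matches: list of (answer_id, confidence), sorted best-first."""
    if not in_domain:
        return "institutional_response"          # out-of-domain query
    top = matches[0][1] if matches else 0.0
    if top >= hi:
        return ("single_response", matches[0][0])
    if top >= mid:
        return ("multiple_responses", [a for a, _ in matches[:3]])
    return "suggest_email"                       # low confidence

print(control_engine([("FAQ-12", 0.91)], in_domain=True))
# ('single_response', 'FAQ-12')
```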
Overall Email System Architecture
User Email Email
Classification
Segmentation
Re-ranker
Interactive Interface:
Original user email +
List of pre-selected
responses
Sentence-level
Classification
Historical
Email DB
In-domain
Resources
Query
Processing
Entity
Extraction
Pre-selection of Responses
System Update
Outcome and Output Indicators
1. Reduction of incoming email volume (10%-20% less)
• User finds more information in the website, and faster
• Lower average number of emails per day sent to agency
2. Reduction in human effort (20%-30% less)
• Less human effort to re-route and reply to emails
• Larger volume of emails processed per time unit
3. Reduction of email response time (20%-30% less)
• Faster internal processing of emails
• Lower average response time to the user
[Customer journey / value chain: USER → Website FAQs → Enquiry → Officer → Response → USER; indicators 1-3 mark successive stages, with response time measured end to end]
Thank you
www.a-star.edu.sg/I2R www.facebook.com/i2r.research
Rafael E. Banchs Human Language Technology Unit