Speech Technology Opportunities and Challenges David Nahamoo Speech CTO, IBM Research Dec 12, 2006


Page 2

Needs for Speech Technology

[Diagram labels: Devices, Commerce, Information; AUTOMATION, USABILITY, GLOBAL ACCESS; Multichannel Self-Service, Multimodal Interaction, Multimedia Analytics]

Value
– Lower cost; Increased customer satisfaction; From cost to revenue
– Ease of use; Speed, efficiency; Extended reach
– Integration of voice/video with enterprise data; Indexing of large amount of multimedia info; Breaking language barriers; Accessibility

Applications
– Command & control
– Dictation
– Information Access
– Transactional
– Problem solving
– Accessibility
– Multilingual communication
– VoiceWeb
– Transcription

Page 3

Major Speech Application Opportunities

Commerce
– Contact Centers
– Unified Communication

Global Access
– Speech To Speech Translation
– Translingual MultiMedia Mining
– Accessibility

Devices
– Automotive
– Set Top Box
– Mobile Phones

Page 4

Speech Technology Innovation that Matters

• Conversational Interaction – Dealing with Complexity

• Speech Analytics – Extracting Insight / Knowledge

• Multilingual Dimension – Globalization

Page 5

Contact Centers of the Future

Page 6

Contact Centers face a number of challenges as they attempt to balance costs, customer experience and revenue growth

1. Cost Reduction/Containment
2. Customer Experience Improvement
3. Revenue Growth

BUT…
– Too much focus on Cost Reduction can actually lead to Poor Customer Experience and Limits on Revenue Growth
– Too much focus on Revenue Growth can actually lead to Poor Customer Experience and Rising Costs
– Too much focus on the Customer Experience can actually lead to Rising Costs and Limits on Revenue Growth

Differing emphasis can be placed on each one, but unless managed carefully and balanced effectively for the business, the effects can be disastrous…

Page 7

Contact Centers – Logical Components and Focus Areas

[Architecture diagram. Layers and services shown: Network Services (VoIP Gateway, Public Internet, Managed IP Network); Channel Services – Assisted and Channel Services – Self-service (Web, Voice, Chat, Email, Voicemail, Video, Portal); Agent Services (Agent Desktop, Routing, Skills, WFM); Outbound Services (Dialer, Presence); Contact Services; Data Services; Systems Information Analytics; back-end business processes, applications and information services, internal and external (ERP, SCM, CRM). Other labels: Mail, Fax, Universal Queue, Voice Callback, ODS, ECM, QAM, KPIs, MDM, EDW, Alerts, Dashboards, RTA, KM, UIM, Search.]

Page 8

Contact Centers – Logical Components and Focus Areas

[Same architecture diagram as on Page 7, with the focus areas highlighted:]
– Self Service
– Information Integration
– Analytics
– Agent Performance
– Revenue Growth
– Multi-channel Access

Page 9

Self-Service

Page 10

Increased Self-Service

Self-service to 80% levels and higher is possible in at least some centers

– Today’s contact centers are typically 10 to 20% self-service in most industries, but at least some companies claim 80% self-service now where Web-based interaction predominates; when voice predominates, numbers are much lower

– Live-agent costs are an order of magnitude higher than the costs of self-service

Self-service adoption has been slow to take off (8% growth 2003-2005)

– Self-service is more challenging technically than agent performance because of the difficulty of achieving high customer satisfaction

– Self-service is often run by a different group from the one that runs the contact center

– Self-service will be the end-game as labor arbitrage becomes increasingly difficult

Whichever vendor develops ways to drive self-service fastest (while maintaining customer satisfaction) will have a commanding position in the marketplace

– Self-service is clearly a huge cost-savings opportunity
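To make the cost argument on this slide concrete, the sketch below computes a blended cost per contact at the 20% and 80% self-service levels mentioned above. The dollar figures are hypothetical placeholders, chosen only to reflect the stated order-of-magnitude gap between live-agent and self-service costs; they are not taken from the presentation.

```python
def blended_cost_per_contact(self_service_share: float,
                             agent_cost: float,
                             self_service_cost: float) -> float:
    """Average cost of one contact given the share handled by self-service."""
    if not 0.0 <= self_service_share <= 1.0:
        raise ValueError("self_service_share must be between 0 and 1")
    return (self_service_share * self_service_cost
            + (1.0 - self_service_share) * agent_cost)


# Hypothetical unit costs (assumption): live agent is 10x self-service.
AGENT_COST = 5.00
SELF_SERVICE_COST = 0.50

if __name__ == "__main__":
    for share in (0.2, 0.8):
        cost = blended_cost_per_contact(share, AGENT_COST, SELF_SERVICE_COST)
        print(f"{share:.0%} self-service: {cost:.2f} per contact")
```

Under these assumed unit costs, moving from 20% to 80% self-service cuts the blended cost per contact by roughly two thirds, which is the kind of saving the slide points to.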

Page 11

Customers prefer the convenience and control aspect of self-service, and have high expectations

Customers prefer self-service
– Self-service preferred for many types of customer contact
• Viewing Bill (42%)
• Checking Minutes (44%)
• Checking/Changing a Talkplan/package (37%)
• Subscribing to Services (38%)
– Web preferred to the phone (50%)
• Provided one can obtain answers in the same amount of time

And their expectations are very high
– Ease of use
• 86% indicate they would stop using an organization if their IVR was difficult to use
– High level of service
• 82% indicate lower level of service via the web unacceptable
– Majority indicate they would abandon a web transaction or go to a competitor due to usability issues.

Source: Fujitsu Consulting and Netonomy, Modalis Research Technologies, Genesys, Inc., Harris Interactive

[Chart: responses to “How important was the ability to serve yourself (as a customer) in your decision to use the service provider in the first place?” for Mobile, Bank, Health insurance and Household insurance; categories: Very important, Quite important, Not very important, Not at all important; axis from 0% to 60%]

Page 12

IVRs are still the dominant self-service channel and they are increasingly becoming speech-enabled

IVRs are still the dominant contact channel

– 45% of contacts start in the IVR channel

IVRs are becoming speech-enabled

– Speech-enabled IVRs support more complex functionality and higher completion rates

• Well-designed voice user interfaces (VUI) can reduce call time by as much as 30% compared to traditional IVR systems and cut opt-out rates by 50% (Forrester, 2003)

• Increased IVR retention rate. Companies are up to 60% more likely to retain a caller within the IVR using speech vs. touch tone (Giga)

Page 13

Conversational Interaction

Should bridge the gap between the user's mental model and the application model
– Task Complexity
– User Familiarity
– User Patience

Should minimize the user effort and task completion time
– Consistent
– Rapid
– Efficient

Page 14

Conversational Solutions

[Chart: applications placed along a COMPLEXITY axis (LOW, MEDIUM, HIGH) and grouped as INFORMATIONAL, TRANSACTIONAL and PROBLEM SOLVING. Applications shown: stock quote, flight status, package tracking, call routing, stock trading, flight reservation, banking, travel, customer care, technical support.]

Page 15

PACKAGE TRACKING, STOCK QUOTE
– User and application models match, time not a factor, no decision making
– ASR: string of numbers & characters + checksum
– ASR: large list of names and symbols

BANKING, STOCK TRADING
– User model is close to application, some decision making, time not a factor
– NLU: directed dialog and limited syntax
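The "string of numbers & characters + checksum" point can be illustrated with a small sketch: when an identifier carries a check digit, recognizer hypotheses that fail the checksum can be discarded before the application acts on them. The slides do not say which checksum is used, so the Luhn algorithm stands in here as an assumption, and the N-best list with its scores is hypothetical.

```python
from typing import List, Optional, Tuple


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    if not digits.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def pick_hypothesis(nbest: List[Tuple[str, float]]) -> Optional[str]:
    """Return the highest-scoring hypothesis that passes the checksum."""
    for text, _score in sorted(nbest, key=lambda p: p[1], reverse=True):
        if luhn_valid(text):
            return text
    return None  # nothing valid: the dialog would need to re-prompt


# Hypothetical recognizer output: (digit string, confidence score).
NBEST = [
    ("79927398718", 0.52),  # top hypothesis, one digit misrecognized
    ("79927398713", 0.41),  # passes the checksum
    ("79927898713", 0.07),
]

if __name__ == "__main__":
    print(pick_hypothesis(NBEST))
```

The design point is that the checksum lets the application recover from a recognition error without asking the caller to repeat the number, as long as a valid string appears somewhere in the N-best list.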

Page 16

CALL ROUTING
– User and application models match, no decision making, long list of concepts
– Substantial Language Understanding

TRAVEL
– User’s model might not match application’s, involved decision making, time a factor
– Substantial Dialog (& Language Understanding)

Page 17

Conversational Help Desk Challenges

Help Desk is the most complex of all three types of conversational speech applications

Complexity is based on Nature of the Call
• User domain model is limited at best
• User is usually upset
• Complex dialog and language understanding

Current Market Solution
• No Industry “best practices” have been established

Page 18

Overview of IBM Help Desk

[Diagram: incoming calls reach a Main Menu (workstation, host, password, business app, telephone). Agent handles 97-99.5% of calls; self service (telephone) handles 0.5% to 3%; 80% of service is HOWTO. Call types shown: Troubleshooting, Create Trouble Ticket, Password Reset, Self Help (FAQ/HOWTO), Not-entitled.]

Introducing ( ) Audrey

Page 19

Speech Analytics

Page 20

Contact Center Analytics

[Diagram: the customer and the enterprise (products & services) meet at contact points: branch office, Web, IVR, call center. Data from self service and agent interactions is integrated and analyzed across structured sources (customer/product transaction data, agent data) and unstructured sources (call logs & transcripts, emails, surveys).]

Integrate & Analyze Structured & Unstructured Data
– Instant Market Intelligence: customer preferences, dissatisfaction drivers, lifetime value management
– Analyze Agent Performance: improve C-Sat, upsell rate
– Analyze Contact Drivers: improve FAQs, Web pages

Analytics enhances value for:
– Self-service
– Agent performance
– Cross-sell/up-sell
– Transformational Diagnostics
– Business Intelligence
– Marketing
– …

Page 21

Call Center Operation Quality

Millions of calls every day

Want general information:

– Are callers happy?

– Are processes followed?

– What are people asking for?

– What is the trend of occurrence of known problems?

– Are there new problems?

Need to know where to take action:

– Save a customer from defecting

– Apologize for mishandled calls

– Show call to agent for coaching

– Follow up on a missed sales opportunity

Currently
– Human monitoring is necessary for these things
– Only a small fraction of calls can be checked
– Most checking is wasted
– There is no permanent record of the calls

Page 22

Speech Analytics for Automated Quality Monitoring

Background:

– IBM NA call center team listens to and evaluates ~1% of all calls

– 35 questions answered

• “did the agent use courteous words and phrases?”

• “did the agent speak in an appropriate tone?”

• “did the agent follow the closing procedure?”

• “did the agent solve the problem?”

– Mostly random calls, rarely interesting

– Typical of all call centers

CallRank Quality Monitoring Application:

– Monitor 100% of calls

– Answer questions and assign default ratings

– Provide a ranked list to human monitors to focus attention on bad calls
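The CallRank idea above (answer each monitoring question automatically, then hand human monitors a ranked list) can be sketched roughly as follows. This is an illustration under stated assumptions, not IBM's implementation: the `CallEvaluation` record, the question keys and the "fraction of failed questions" score are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallEvaluation:
    call_id: str
    # Question key -> True if the automatic answer is favorable (assumption).
    answers: Dict[str, bool]

    @property
    def badness(self) -> float:
        """Fraction of monitoring questions the call failed."""
        if not self.answers:
            return 0.0
        failed = sum(1 for ok in self.answers.values() if not ok)
        return failed / len(self.answers)


def rank_calls(calls: List[CallEvaluation], top_k: int) -> List[CallEvaluation]:
    """Return the top_k calls most likely to need human attention."""
    return sorted(calls, key=lambda c: c.badness, reverse=True)[:top_k]


if __name__ == "__main__":
    calls = [
        CallEvaluation("a", {"courteous": True, "closing": True}),
        CallEvaluation("b", {"courteous": True, "closing": False}),
        CallEvaluation("c", {"courteous": False, "closing": False}),
    ]
    for call in rank_calls(calls, top_k=2):
        print(call.call_id, call.badness)
```

Sorting by a single score is the simplest choice; a real system would presumably weight questions differently, but the ranking step itself stays the same.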

[Diagram: CallRank on WebSphere. Audio from IBM call centers is extracted from CM/DB2 and passed through a UIMA processing pipeline (Collection Reader, Speech-to-Text Annotator, Analytics Annotators, CAS Consumer) that turns audio into text and evaluates calls; analysis and transcripts are stored back into CM/DB2.]

Page 23

Example of a good call

Page 24

Example of a bad call

Page 25

Automated Quality Monitoring

• Status:
– Three times as many bad calls found for same listening effort
– Processing ~3000 calls/day now from all North American centers

• Technology:
– Answer many questions with pattern matching on decoded text
  Did the agent follow the appropriate closing script? Search for “THANK YOU FOR CALLING”, “ANYTHING ELSE”, “SERVICE REQUEST”
– Use other linguistic cues to improve the accuracy of the system
  Number of hesitations (UH, UM, HUM, etc), total silence, longest silence, …
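A minimal sketch of the pattern-matching approach described on this slide, assuming the decoded text is available as a plain uppercase-style transcript string. The phrase list and hesitation tokens come from the slide; the function names and the tokenization are illustrative assumptions, and the silence-based cues are left out because they would need timing information from the recognizer.

```python
import re
from typing import List

CLOSING_PHRASES = ("THANK YOU FOR CALLING", "ANYTHING ELSE", "SERVICE REQUEST")
HESITATIONS = {"UH", "UM", "HUM"}


def _tokens(transcript: str) -> List[str]:
    """Uppercase the transcript and split it into word tokens."""
    return re.findall(r"[A-Z']+", transcript.upper())


def closing_phrases_found(transcript: str) -> List[str]:
    """Return the closing-script phrases that occur in the transcript."""
    # Pad with spaces so phrases only match on whole-word boundaries.
    text = " " + " ".join(_tokens(transcript)) + " "
    return [p for p in CLOSING_PHRASES if f" {p} " in text]


def count_hesitations(transcript: str) -> int:
    """Count filled pauses such as UH and UM in the transcript."""
    return sum(1 for tok in _tokens(transcript) if tok in HESITATIONS)


if __name__ == "__main__":
    sample = ("um okay I have opened a service request uh "
              "is there anything else thank you for calling")
    print(closing_phrases_found(sample))
    print(count_hesitations(sample))
```

Whether a call "followed the closing script" when only some phrases are present is a policy decision; returning the matched phrases leaves that choice to the caller.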

Page 26

Agent Performance

Page 27

Agent Performance

Personnel costs are by far the largest component of existing contact center costs

– Move to off-shore operations has resulted in significant (up to 75%) labor cost reductions

– Large contact centers have very large numbers of personnel

– Estimated 6M agents in U.S. in 2004 and continuing to grow

Even with the rise of self-service, a percentage of calls will still be handled by live agents

Numerous opportunities exist to improve performance by automation:

– Integration of systems across the business for use in the contact center

– On-boarding process (e.g., accent monitoring)

– Training (on-boarding, continuing education, real-time training)

– Agent quality monitoring

– Call logging (30% of agent time in some contact centers)

– Helping the agent find the answer to the customer’s question

– Workforce management

– Intelligent call routing globally

– Expert “multi-channel” agents

– Activity-centric computing and other collaborative projects

Page 28

Agent Performance: Voice Assessment/Training

Increased number of off-shore centers
– e.g., India (>50% growth)

Key focus in off-shore contact centers
– Hiring: shrinking candidate pool and high agent attrition rates
– Training: train agents to have neutral accents to improve customer experience

Voice Assessment/Training System
– Candidate screening for grammar, pronunciation and spoken language comprehension
– Accent training: correctness of pronunciation, intonation, speaking rate and syllable stress

Page 29

Contact Centers Summary

Contact centers are focal points in an enterprise from which all customer contacts are managed

Contact Centers face a number of challenges as they attempt to balance costs, customer experience and revenue growth

Customers increasingly prefer self-service and speech self-service is now ready for prime time

Enterprises can achieve improved agent performance with agent productivity tools and agent hiring/training tools

Enterprises should focus on revenue growth, transforming their contact centers from cost centers to profit centers

Customer demand for choice, convenience and consistency is driving the adoption of multi-channel enablement in contact centers

Actionable intelligence from real-time and offline analytics of structured and unstructured customer interaction data will lead to new opportunities for cost reduction, revenue growth and improved customer experience

Page 30

Increasing Global Reach

Page 31

Global Language Barriers

Different languages spoken by people living in different regions or even by different ethnic groups living in the same region

Language barriers cause…
– High cost for agents – need both subject matter expertise and language skills
• Call centers, insurance agents, etc.
– Unreachable to broad international business or tourism travel market
– Life threatening in
• medical emergency
• natural disaster situations
• military
– Multilingual on demand media and entertainment

Page 32

Data Point: Online Population Language Mismatch

Mismatch: the diversity of languages spoken online is increasing, yet the languages of web pages are consolidating

*Global Internet Statistics (http://www.glreach.com/globstats/index.php3)

Page 33

Multimodal Multimedia Translingual Access

[Diagram: video, audio, image and text inputs (content and context) are processed by Speech Recognition, OCR and Machine Translation, then by Information Analytics (text mining, categorization, taxonomies, entity extraction, entity relation, ontology, …) and Multimedia Analytics (transcription, biometrics, …). Combined capabilities shown: Translingual Analytics, Multimedia Translingual Analytics, Multimodal Access, Translingual Access and Multimodal Translingual Access, for informational & transactional use.]

Page 34

S2S Translation: Call for Innovation

Speech Recognition Challenges

– Needs to work in noisy environments, with spontaneous, conversational speech in multiple languages, which may be emotional when the speaker is under stress.

Translation has to handle output of ASR system

– Recognition errors

– Spoken language: different from written language
• Non-grammatical disfluencies
• Imperfect syntax
• Lack of formal characteristics of text: no punctuation or paragraphing

Translated text must be "speakable" for oral communication

– not enough to translate content adequately; output must be fluent

– Need to carefully consider and tune interactions between ASR, MT and NLG – need access to all components

Cost-effective development of new languages and domains

Intonation translation remains a grand challenge
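One of the points above, that translation must cope with disfluent ASR output, can be illustrated with a small pre-processing sketch that sits between the recognizer and the translation engine. This is only an assumption about how such a step might look: the filled-pause list and the rule of collapsing immediate word repetitions are hypothetical, and the repetition rule would also collapse intentional repeats such as "very very".

```python
# Filled pauses to drop before translation (illustrative list, an assumption).
FILLED_PAUSES = {"uh", "um", "er", "hmm"}


def clean_asr_output(text: str) -> str:
    """Remove filled pauses and immediate word repetitions from ASR text."""
    cleaned = []
    for token in text.lower().split():
        if token in FILLED_PAUSES:
            continue  # drop hesitation sounds
        if cleaned and cleaned[-1] == token:
            continue  # collapse a stuttered repeat of the previous word
        cleaned.append(token)
    return " ".join(cleaned)


if __name__ == "__main__":
    print(clean_asr_output("uh i i need um a doctor doctor now"))
```

A rule-based cleanup like this cannot fix recognition errors or restore punctuation; it only shows why the ASR and MT components need to be tuned together rather than chained blindly.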

Page 35

Speech Technology Driving New Business Opportunities

• Increasing Self Service: More natural interaction with more difficult tasks is made possible

• Increasing Agent Productivity, Monitoring Quality, and Increasing Sales Opportunity: Extracting insight from the content of conversation

• Increasing the Global Reach: Breaking the language barrier