Post on 10-Feb-2016
Transforming Contact Centers with Speech and IP
Jack Chase, Director of Product Management, NMS
Rob Kassel, Senior Manager, Network Speech Products, Nuance
Slide 2
Agenda
The Evolution of Contact Centers
  Business trends
  Architectures
Speech Technology Update (Rob Kassel, Nuance)
MRCP-enabled speech
www.nmscommunications.com
Slide 4
Evolution of Contact Centers: Business Trends
First Generation: Hardware-based, Cost Center
  Stand-alone sites
  Limited PBX routing
  Customer talks into phone; agent types into computer
Second Generation: Integration and Technology
  Single and distributed sites
  Some use of IVRU and ACD
  Screen pops
  Some call routing via ACD
Third Generation: Solving Business Problems, Profit Center
  Virtual Call Center
  IVRU & ACD integration
  Multi-media access: email, fax, web
  Integrated ERP/CRM
  Skills-based routing
Slide 5
The Obvious Cost Savings Target
Agent Costs: 66%
Telecom Costs: 15%
Technology: 12%
Outsourced Calls: 7%
Source: Benchmark Portal, 2002
Slide 6
The Cost of Customer Interaction is Reduced with Self Service
[Bar chart: cost per customer interaction for Web, IVR, Chat, and Phone, grouped as Self-Service and Assisted Service. Data labels: $0.24, $0.45, $5.00, $7.00, $5.50, $40.]
Source: Gartner Group, 2002
Slide 7
Evolution of Contact Centers: Technology Trends
Self-service using web, ASR and TTS is reducing costs and the dependency on live agents
Web, email, and messaging are freely mixed with phone calls in a single queue
Network-based contact centers are becoming a significant phenomenon
VoIP is lowering system costs at the agent and between system components
By 2007, 30% of contact center agents will be on VoIP
Slide 9
VoIP in an IP Contact Center
[Diagram: Site A, Site B (IP-PBX), CRM, Contact Center (ACD + CTI + IVR + Speech), PSTN, Self-Service, and Operations Center. Link types in the legend: Circuit, Data, VoIP.]
Slide 10
Upgrading with MRCP and VXML
[Diagram: Site A, Site B (IP-PBX), CRM, PSTN, Operations Center, Media Server, Application Server, VXML Server, and Speech Server. Link types in the legend: Circuit, Data, VoIP. Protocol labels: MRCP, RTP, SIP, CCXML, VXML.]
Slide 11
Speech Technology Update
Rob Kassel, Senior Manager, Network Speech Products, Nuance
www.nuance.com
Slide 12
The Need For Speech Recognition
DTMF often is used for customer self-service
  Numeric entry is easy… unless you are reading
  Spelling entry is more difficult
  Menus need to be enumerated, can’t be too long
  Deep menu structure becomes tiresome
  Assignment inconsistent between vendors (e.g., voicemail)
  How do you enter “5 ½%” or “Albuquerque”?
With speech, questions are answered naturally
  Caller satisfaction is higher
  Fewer zero-outs lead to additional cost savings
Slide 13
Speech Recognition Process
[Diagram components: Speech, Speech Detector, Feature Extraction, Phoneme Classifier, Acoustic Models, Search, Confidence Scoring, Results; Grammar, Grammar Compiler, System Dictionary, Pronunciation Rules.]
Slide 14
Speech Recognition Challenges
Processor and memory demands
Speech can be difficult to decode, even for humans
  Fixed, confusable vocabularies: “B-C-D-E-G-P-T-V-Z”
  Ambiguous boundaries: “It’s hard to wreck a nice beach!”
Speaker variability: dialect, volume, gender, etc.
Noise rejection: hands-free, mobile, telematics
Out-of-vocabulary rejection & confidence measures
Callers don’t always say what you might expect…
  Yes or no?
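
One common way to act on confidence measures and out-of-vocabulary rejection is to compare each recognition result against two thresholds: accept it outright, confirm it with the caller, or reject it and reprompt. The sketch below is illustrative only. The `Hypothesis` type, the threshold values, and the 0.0 to 1.0 confidence scale are assumptions made for this example, not part of any Nuance API.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A single recognition result (hypothetical structure)."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


# Assumed thresholds; real deployments tune these per application.
ACCEPT_THRESHOLD = 0.80
REJECT_THRESHOLD = 0.40


def next_action(hyp, vocabulary):
    """Decide what the dialog should do with a recognition result."""
    # Out-of-vocabulary or very low confidence: ask the caller again.
    if hyp.text not in vocabulary or hyp.confidence < REJECT_THRESHOLD:
        return "reprompt"
    # Middling confidence: confirm ("Did you say yes?") before proceeding.
    if hyp.confidence < ACCEPT_THRESHOLD:
        return "confirm"
    # High confidence: proceed without confirmation.
    return "accept"
```

For a yes/no question, `next_action(Hypothesis("yes", 0.95), {"yes", "no"})` accepts the answer, while an unexpected reply such as "maybe" triggers a reprompt regardless of its score.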
Slide 15
Speech Recognition: State of the Art
Callers speak naturally in directed dialogs
High accuracy, infrequent confirmation
Million-word vocabularies:
  stocks, proper names, street addresses
Scripting to control values returned to application:
  “half past three” can return “1530” or “afternoon”
Open-ended responses, especially for call routing
  Allows for questions like “How may I help you?”
Based on statistical methods trained from examples
Slide 16
The Need For Text-To-Speech
Professional recordings best for fixed content
Word concatenation is difficult to do well
  Often used for numeric output
  Can sound mechanical; irritating when frequent
Large output vocabularies fairly common (e.g. city names)
Some applications defy recordings (e.g. messaging)
Slide 17
TTS Text Analysis
[Diagram components: Source Text, Text Normalization, Homograph Disambiguation, Pronunciation Generation, System Dictionary, Pronunciation Rules, Prosody Generation, Annotated Text.]
Text Normalization
  “Are you there?” → are + you + there + <question>
  $31 → thirty one dollars
  ATM → eh tee em
  NATO → nay-toh
  A.M. → eh em
  CUL8R → see you later
Homograph Disambiguation
  minute = 60 seconds; minute = tiny
  Dr. Jones → doctor jones; Jones Dr. → jones drive
  11210 → eleven thousand two hundred ten (number)
  11210 → one one two one oh (ZIP code)
Prosody Generation
  Determine which words require emphasis
  Insert pauses based on phrase boundaries, lung capacity
  Assign duration, pitch, and volume to each phoneme
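
A few of the normalization examples on this slide can be reproduced with a small rule-based sketch. It is only an illustration of the idea, assuming the caller already knows whether a digit string is a quantity or a ZIP code; production TTS front ends use far richer context. All function names here are hypothetical.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Digit-by-digit reading, with 0 spoken as "oh" as on the slide.
DIGITS = ["oh"] + ONES[1:10]


def number_to_words(n):
    """Read an integer in [0, 999999] as a quantity."""
    if not 0 <= n < 1_000_000:
        raise ValueError("number out of supported range")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def digits_to_words(digits):
    """Read a digit string one digit at a time, as for a ZIP code."""
    return " ".join(DIGITS[int(d)] for d in digits)


def normalize_currency(token):
    """Expand a whole-dollar amount such as '$31'."""
    match = re.fullmatch(r"\$(\d+)", token)
    if match is None:
        raise ValueError(f"not a whole-dollar amount: {token!r}")
    amount = int(match.group(1))
    unit = "dollar" if amount == 1 else "dollars"
    return f"{number_to_words(amount)} {unit}"


def expand_dr(text):
    """Expand 'Dr.' as a title before a name and as 'drive' after one."""
    text = re.sub(r"\bDr\.(?=\s+[A-Z])", "doctor", text)
    text = re.sub(r"(?<=\w\s)Dr\.", "drive", text)
    return text.lower()
```

The same token "11210" produces different output depending on which reader is chosen, which is the disambiguation problem the slide highlights.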
Slide 18
TTS Waveform Generation
Parametric
  [Diagram components: Annotated Text, Parameter Generation, Vocal Tract Model, Speech]
  Can mimic natural speech if parameters are set by hand
  In practice sounds somewhat robotic, the “drunken Swede”
  Can produce a variety of voices
  Extremely compact
Concatenative
  [Diagram components: Annotated Text, Unit Selection, Voice Database, Concatenate and Smooth, Speech]
  Units can be smaller or larger than a phoneme
  Database tends to be very large
  Preserves speaker characteristics and speaking style of voice talent
[Labels: FEMALE, FEMALE, CHILD]
Slide 19
Text-to-Speech: State of the Art
Naturalness of concatenative TTS is generally preferred for call center applications
  …but voice talent takes direction, more expressive
Custom voices to maintain brand identity
Use one voice talent for both recordings and TTS
  Seamlessly mix dynamic data with static prompts
  Apply prompt “patches” rapidly until cost of recording session can be justified
Slide 20
Designing Speech Applications
Observe & interview call center agents
Listen to calls, develop caller profiles
  Who are they?
  What do they know?
  Where are they calling from?
  What are their goals?
  What are their priorities?
Determine business objectives & rules
Define speech user interface
  Call flows
  Prompt wording
  Error recovery; help and instructions
  Anthropomorphism and persona
Slide 22
What is MRCP v1?
Speech servers are connected by VoIP to IVR servers
Standard API for ASR and TTS
Easy to reconfigure system as needs change
Easy to implement redundancy
Control: MRCP/RTSP/TCP/IP
Speech: G.711/RTP/UDP/IP
[Diagram components: PSTN, IVR Servers, IP, MRCP Server, Speech Servers]
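
The control stack listed on this slide (MRCP over RTSP over TCP) means that an MRCP v1 request travels as the body of an RTSP message. The sketch below builds a SPEAK request and wraps it in an RTSP ANNOUNCE, following the tunnelling layout described in RFC 4463 as best understood here. It only constructs the bytes and does not open a connection. The server URL, session identifier, request ID, and helper names are hypothetical, and this is not the NMS or Nuance implementation.

```python
CRLF = "\r\n"


def build_mrcp_speak(request_id, ssml):
    """Build an MRCP v1 SPEAK request carrying an SSML body."""
    body = ssml.encode("utf-8")
    head = CRLF.join([
        f"SPEAK {request_id} MRCP/1.0",
        "Content-Type: application/synthesis+ssml",
        f"Content-Length: {len(body)}",  # length of the SSML body in bytes
    ]) + CRLF + CRLF
    return head.encode("ascii") + body


def wrap_in_rtsp_announce(url, cseq, session, mrcp_message):
    """Tunnel an MRCP message inside an RTSP ANNOUNCE request."""
    head = CRLF.join([
        f"ANNOUNCE {url} RTSP/1.0",
        f"CSeq: {cseq}",
        f"Session: {session}",
        "Content-Type: application/mrcp",
        f"Content-Length: {len(mrcp_message)}",  # length of the MRCP message
    ]) + CRLF + CRLF
    return head.encode("ascii") + mrcp_message


# Example values are placeholders, not a real server or session.
example = wrap_in_rtsp_announce(
    "rtsp://speech.example.com/media/synthesizer",
    4,
    "12345678",
    build_mrcp_speak(543257, '<?xml version="1.0"?><speak>Hello</speak>'),
)
```

The synthesized audio itself does not travel on this control channel; as the slide notes, it is carried separately as G.711 over RTP.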
Slide 23
Natural Access and MRCP
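[Architecture diagram components]
Services: IVR Services, PSTN Trunking, VoIP (Fusion), Conferencing, Fax Services, USAI (MRCP), OAM, Video Access, Call Control, SNMP
Middle layer: Service Managers, Libraries
Lower layer: Driver, Driver, Driver, IPC
Interfaces: PCI, PCI, PCI, IP
Platforms: CX Boards, AG Boards, CG Boards, PacketMedia HMP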
Slide 25
Current Support for Universal Speech Access
Vendor | Type | Universal Speech Access 1.0 | Universal Speech Access 1.1
Nuance | ASR | MRCP Server SP5, Nuance 8.5 | MRCP Server SP7, Nuance 8.5
Nuance (ScanSoft) | ASR | OSMS 2.0.1, OSR 2.0 | SWMS 3.1, OSR 3.0
Nuance | TTS | Vocalizer 3.0 | Vocalizer 3.0.8
Nuance (ScanSoft) | TTS | OSMS 2.0.1, Speechify 2.0 | SWMS 3.1, RealSpeak 4.0
Telisma | ASR | Philsoft 3.2 | teliSpeech 1.0 SP4
Loquendo | ASR | N/A | Loquendo ASR LSS 6.0
Slide 26
What’s Next for MRCP?
MRCP v2
  draft-ietf-speechsc-mrcpv2-06, Feb 20, 2005
  Adds SIP/SDP for session setup
    Replaces RTSP
  Adds support for speaker verification
  Little deployment yet
NMS will update USAI when deployments occur