Media Manager Mail Access Unified Messaging

Media Manager Mail Access Unified Messaging Barbara Hohlt UC Berkeley Ericsson Presentation August 22, 2000


Transcript of Media Manager Mail Access Unified Messaging

Page 1: Media Manager Mail Access Unified Messaging

Media Manager Mail Access
Unified Messaging

Barbara Hohlt

UC Berkeley
Ericsson Presentation August 22, 2000

Page 2: Media Manager Mail Access Unified Messaging

Messages from many sources

PSTN Phone

Cell-Phone Desktop

Pager

MediaManager Mail Access

Page 3: Media Manager Mail Access Unified Messaging

Project Overview

• Make messages more accessible
  – Get all types of messages
  – Access from different devices with different capabilities
  – Enable faster browsing of many voicemails

• Media Mail services
  – A unified messaging infrastructure
  – Voicemail is email encoded in MIME

• Transcoding services
  – Enhance voicemail interaction
  – Includes: skimmed audio, transcript, text/audio summary, and outline

Page 4: Media Manager Mail Access Unified Messaging

Related Work

• Universal Inboxes/Unified Messaging
  – onebox.com
  – CoolMail.net
  – Lucent/Octel Unified Messenger
  – Stanford Mobile People Architecture

• Audio Content Extraction Techniques
  – SpeechSkimmer, MIT’s MultiMedia Lab [Arons95]
  – Auto-Summarization, Microsoft Research
  – CueVideo, IBM

Page 5: Media Manager Mail Access Unified Messaging

Architecture

Transcoder Service
• Voicemail -> Text Transcript
• Voicemail -> Text Summary
• Voicemail -> Text Outline
• Email -> Plain Audio
• Email -> GSM Audio
• Voicemail -> GSM Summary
• Voicemail -> Audio Summary
• Voicemail -> Skimmed Audio

[Diagram components: three Clients; Media Manager Interface; Media Manager Service; Folder Store; Transcoder Service; one Mail Access Interface each for NinjaMail, POP, and IMAP]

Page 6: Media Manager Mail Access Unified Messaging

Applications

• Conventional GUIs
• Context-Aware Applications
• Iceberg Universal Inbox Component

[Diagram labels: Desktop; MediaManager Mail Access]

A conventional desktop GUI can contact the Media Manager directly and request messages as text. The Media Manager will return emails and voicemails as text.

Page 7: Media Manager Mail Access Unified Messaging

Context-Aware Application

[Diagram labels: Palm Device; Redirection Proxy; Desktop; MediaManager Mail Access]

1. The Palm device asks for a list of messages as text and selects a voicemail.
2. The Palm device requests a redirection from the proxy, which forwards the redirection request to the desktop.
3. The desktop asks for the voicemail and plays it.

Page 8: Media Manager Mail Access Unified Messaging

Iceberg Universal Inbox

[Diagram labels: Bhaskar’s Cell-Phone; Barbara’s PSTN Phone; Automatic Path Creation Service; Naming Service; Preference Registry; Universal Inbox; MediaManager Mail Access. Annotations: "800-MEDIA-MGR UID: [email protected]" and "mediamgr: Cluster locn." Arrows are numbered 1, 2, 3.]

Page 10: Media Manager Mail Access Unified Messaging

MediaManagerServiceIF

• getFolders( ) and getFoldersAs( )
  – Given a username, returns a list of folder names
  – getFoldersAs( ) returns the list as audio or GSM

• getList( ) and getListAs( )
  – Given a username, foldername, and count
  – Returns a list of messages (sendername, title, date)
  – getListAs( ) returns the list as audio or GSM

• getMessage( )
  – Given a Message Ref, returns the entire message

• getMessageContent( )
  – Given a Content ID and return type
  – Returns one part of the message as the return type

Page 11: Media Manager Mail Access Unified Messaging

Messages and Content Objects

• Media Message
  – Media Reference id
  – Array of Content Objects

• Content Object
  – Content ID
  – Data

• Content ID
  – Media Reference id
  – Content Part index
  – Content Type
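The three structures listed on this slide could be sketched as plain Java classes. This is only an illustration: the field types, the `find` helper, the `"msg-1"` reference id, and the MIME type strings are assumptions, since the slides give field names but no declarations.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Identifies one part of one message (field names follow the slide; types are assumed).
final class ContentId {
    final String mediaRefId;   // Media Reference id of the owning message
    final int partIndex;       // Content Part index
    final String contentType;  // Content Type, written here as a MIME type string (assumed)

    ContentId(String mediaRefId, int partIndex, String contentType) {
        this.mediaRefId = mediaRefId;
        this.partIndex = partIndex;
        this.contentType = contentType;
    }
}

// One part of a message: its Content ID plus the raw data.
final class ContentObject {
    final ContentId id;
    final byte[] data;

    ContentObject(ContentId id, byte[] data) {
        this.id = id;
        this.data = data;
    }
}

// A whole message: a Media Reference id and its Content Objects.
final class MediaMessage {
    final String mediaRefId;
    final List<ContentObject> parts;

    MediaMessage(String mediaRefId, List<ContentObject> parts) {
        this.mediaRefId = mediaRefId;
        this.parts = Collections.unmodifiableList(parts);
    }

    // Hypothetical helper: resolves a Content ID to the part it names, or null if none.
    ContentObject find(ContentId id) {
        if (!mediaRefId.equals(id.mediaRefId)) return null;
        if (id.partIndex < 0 || id.partIndex >= parts.size()) return null;
        return parts.get(id.partIndex);
    }
}

class MessageSketch {
    static MediaMessage sampleVoicemail() {
        String ref = "msg-1"; // hypothetical reference id
        ContentObject transcript = new ContentObject(
                new ContentId(ref, 0, "text/plain"),
                "Hello, this is Barbara.".getBytes(StandardCharsets.UTF_8));
        ContentObject audio = new ContentObject(
                new ContentId(ref, 1, "audio/basic"),
                new byte[] {0, 1, 2, 3}); // placeholder bytes, not real audio
        return new MediaMessage(ref, Arrays.asList(transcript, audio));
    }

    public static void main(String[] args) {
        MediaMessage msg = sampleVoicemail();
        ContentObject part = msg.find(new ContentId("msg-1", 1, "audio/basic"));
        System.out.println(part.id.contentType + ", " + part.data.length + " bytes");
    }
}
```

Because a Content ID carries the Media Reference id and the part index, a client can hold on to a small identifier and ask for a single part later rather than fetching the whole message.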

Page 12: Media Manager Mail Access Unified Messaging

Interface Example

• User asks for list of messages as GSM
• Media Manager returns a list of message headers
• Cell Phone sends a Content ID back
• Media Manager sends a voicemail Content Object

[Diagram labels: Cell-Phone; MediaManager Mail Access; Media Message Header; Content ID; Content Object]

Page 13: Media Manager Mail Access Unified Messaging

Audio Tools

• Speech Recognition/Synthesis
  – Transcribe voicemail to text
  – IBM ViaVoice SDK and custom audio libs

• Natural Language Processing
  – Directed word spotting by “understanding” content
  – ViaVoice SRCL

• Pitch
  – Detecting important words by emphasized pitch

• Pause
  – Compression through pause removal

• Spurts
  – Retrieve sentence structure of voicemail

Page 14: Media Manager Mail Access Unified Messaging

Transcoding Techniques

Transcoding                      Technique
Voice Mail -> Text Transcript    Speech recognition
Voice Mail -> Text Summary       NLP, pitch detection and recognition
Voice Mail -> Text Outline       Pause detection and speech recognition
E Mail -> Plain Audio            Speech synthesis
E Mail -> GSM Audio              Speech synthesis and toast
Voice Mail -> Skimmed Audio      Pause detection
Voice Mail -> Audio Summary      Text summary and speech synthesis
Voice Mail -> GSM Summary        Audio summary and toast
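Several rows above are built from other rows: the audio summary is a text summary followed by speech synthesis, and the GSM summary is an audio summary followed by toast. A minimal sketch of that chaining is shown below. The stage functions are stubs that only tag their input so the order of stages is visible, and all class and constant names are hypothetical.

```java
import java.util.function.Function;

class TranscodePipeline {
    // Stub stages: each one wraps its input in a tag instead of doing real work.
    static final Function<String, String> TEXT_SUMMARY = in -> "textSummary(" + in + ")";
    static final Function<String, String> SPEECH_SYNTHESIS = in -> "synthesize(" + in + ")";
    static final Function<String, String> TOAST = in -> "toast(" + in + ")";

    // Voice Mail -> Audio Summary: text summary, then speech synthesis.
    static final Function<String, String> AUDIO_SUMMARY =
            TEXT_SUMMARY.andThen(SPEECH_SYNTHESIS);

    // Voice Mail -> GSM Summary: audio summary, then toast.
    static final Function<String, String> GSM_SUMMARY =
            AUDIO_SUMMARY.andThen(TOAST);

    // E Mail -> GSM Audio: speech synthesis, then toast.
    static final Function<String, String> EMAIL_TO_GSM =
            SPEECH_SYNTHESIS.andThen(TOAST);

    public static void main(String[] args) {
        System.out.println(GSM_SUMMARY.apply("voicemail"));
        System.out.println(EMAIL_TO_GSM.apply("email"));
    }
}
```

Composing stages this way means a new output format needs only one new final stage, which matches how the deck later describes adding GSM access.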

Page 15: Media Manager Mail Access Unified Messaging

Examples

Original Voicemail:

“Hello, This is Barbara. How are you and the cats doing? I was wondering if you would feed them a little more the first time in case they eat too much. My number is (713) 465-5155. You can call me anytime. Have a very good holiday. Bye bye”

Processed Voicemail:

Translated Talk spurts (Pitch emphasized words in green)
• Phyllis Barbara
• Area in the cat staring
• And then if you run but feed them
• A little more the first time in case they eat too much
• On my number is (713) 465-5155
• You can call me anytime.
• Have every holiday
• Of light

[Audio clips: (Skimmed) (Just pitch)]

Translated using NLP
• Hello this is Barbara
• My number is (713) 465-5155

Page 16: Media Manager Mail Access Unified Messaging

Examples continued...

Original Voicemail:

“Faced with a seemingly inevitable engineering task authors tend to adopt one of two strategies for adding new services to the Internet landscape: inflexible, highly tuned, hand-constructed services….”

Processed Voicemail:

Translated Talk spurts (Pitch emphasized words in green)
• Faced with a seemingly inevitable engineering task authors tend to adopt what it to strategies for adding new services to the internet landscape.
• Inflexible, highly Tate, had constructed services….

[Audio clips: (Skimmed) (Just pitch)]

Translated using NLP
• <Nothing>

Page 17: Media Manager Mail Access Unified Messaging

Results

• Pause detection
  – Worked well for given applications
  – Playback speedup by 50-70%

• Pitch detection
  – Problems due to high pitch sounds and transitions

• Speech recognition
  – Performance decrease in conversational settings

• Natural Language Processing
  – Performed well with small grammar

Page 18: Media Manager Mail Access Unified Messaging

Example: Adding GSM Access

• Define specific types, i.e., GSMAudio, GSMSummary
• Optionally create new Content Objects
• Add Content Object definition to MediaManager
• Add GSM transcoder to TranscoderService

Page 19: Media Manager Mail Access Unified Messaging

Detail: Adding GSM Access

• Add Content Object definition to MediaManager
  – Define GSMAUDIO and GSMSUMMARY
  – Add cases to createObject() in Content Object
  – Add cases to Media Manager

• Add GSM to Transcoder
  – Add method toGSM() to Transcoder
  – Edit .config file
    • External.transcoder.gsm rungsm
  – Edit related transcoders
    • speechSynthesizer and audioSummary()
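The two steps above might look roughly like the following. This is a sketch under stated assumptions: it treats the .config file as `java.util.Properties` syntax (where a key and value may be separated by whitespace), which the slides do not confirm, and the enum, class, and method names other than GSMAUDIO, GSMSUMMARY, and the `External.transcoder.gsm` key are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class GsmAccessSketch {
    // GSMAUDIO and GSMSUMMARY are the two types named on the slide;
    // TEXT and AUDIO stand in for whatever types already existed.
    enum ContentType { TEXT, AUDIO, GSMAUDIO, GSMSUMMARY }

    // Reads the external command registered for the GSM transcoder.
    // Assumes Properties syntax, so "External.transcoder.gsm rungsm" maps key to value.
    static String externalGsmCommand(String configText) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config.getProperty("External.transcoder.gsm");
    }

    // Analogue of "add cases": the new types are the ones routed to the GSM transcoder.
    static boolean needsGsmTranscoder(ContentType type) {
        switch (type) {
            case GSMAUDIO:
            case GSMSUMMARY:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        String config = "External.transcoder.gsm rungsm\n";
        System.out.println(externalGsmCommand(config));
        System.out.println(needsGsmTranscoder(ContentType.GSMSUMMARY));
    }
}
```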

Page 20: Media Manager Mail Access Unified Messaging

Implementing Other Mail Stores

• Examples: IMAP, POP, Microsoft Exchange Server
• Implement MailAccessIF
  – String[] getMAFolders( userName )
  – MediaMessage[] getMAList( userName, folderName, count )
  – MediaMessage getMAMessage( MediaRef )
  – ContentObject getMAMessageContent( ContentID )
• Add new protocol to Media Manager protocol table
• Optionally add the protocol for users to FolderStore
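To make the steps above concrete, here is a hedged sketch of a mail store that implements the four MailAccessIF methods and is then registered in a protocol table. The parameter types, the stand-in message classes, the in-memory store, the `deliver` helper, and the map-based protocol table are all assumptions for illustration; the slides list only the method names and return types.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Minimal stand-ins for the message types; the real classes are not shown in the slides.
final class MediaRef {
    final String id;
    MediaRef(String id) { this.id = id; }
}

final class ContentID {
    final MediaRef ref;
    final int part;
    ContentID(MediaRef ref, int part) { this.ref = ref; this.part = part; }
}

final class ContentObject {
    final ContentID id;
    final byte[] data;
    ContentObject(ContentID id, byte[] data) { this.id = id; this.data = data; }
}

final class MediaMessage {
    final MediaRef ref;
    final ContentObject[] parts;
    MediaMessage(MediaRef ref, ContentObject[] parts) { this.ref = ref; this.parts = parts; }
}

// The four methods listed on the slide, with assumed parameter types.
interface MailAccessIF {
    String[] getMAFolders(String userName);
    MediaMessage[] getMAList(String userName, String folderName, int count);
    MediaMessage getMAMessage(MediaRef ref);
    ContentObject getMAMessageContent(ContentID id);
}

// Hypothetical in-memory store; a real one would talk to IMAP, POP, or Exchange.
final class InMemoryMailAccess implements MailAccessIF {
    private final Map<String, Map<String, List<MediaMessage>>> folders = new HashMap<>();
    private final Map<String, MediaMessage> byRef = new HashMap<>();

    // Test helper, not part of the interface: stores a one-part text message.
    void deliver(String user, String folder, String refId, String text) {
        MediaRef ref = new MediaRef(refId);
        ContentObject body = new ContentObject(
                new ContentID(ref, 0), text.getBytes(StandardCharsets.UTF_8));
        MediaMessage msg = new MediaMessage(ref, new ContentObject[] {body});
        folders.computeIfAbsent(user, u -> new TreeMap<>())
               .computeIfAbsent(folder, f -> new ArrayList<>())
               .add(msg);
        byRef.put(refId, msg);
    }

    private Map<String, List<MediaMessage>> foldersOf(String user) {
        Map<String, List<MediaMessage>> found = folders.get(user);
        return found != null ? found : Collections.emptyMap();
    }

    public String[] getMAFolders(String userName) {
        return foldersOf(userName).keySet().toArray(new String[0]);
    }

    public MediaMessage[] getMAList(String userName, String folderName, int count) {
        List<MediaMessage> all = foldersOf(userName).get(folderName);
        if (all == null) return new MediaMessage[0];
        int n = Math.max(0, Math.min(count, all.size()));
        return all.subList(0, n).toArray(new MediaMessage[0]);
    }

    public MediaMessage getMAMessage(MediaRef ref) {
        return byRef.get(ref.id);
    }

    public ContentObject getMAMessageContent(ContentID id) {
        MediaMessage msg = byRef.get(id.ref.id);
        if (msg == null || id.part < 0 || id.part >= msg.parts.length) return null;
        return msg.parts[id.part];
    }
}

class MailStoreSketch {
    // Protocol table mapping a protocol name to its mail store (assumed shape).
    static final Map<String, MailAccessIF> PROTOCOLS = new HashMap<>();

    public static void main(String[] args) {
        InMemoryMailAccess store = new InMemoryMailAccess();
        store.deliver("barbara", "inbox", "m1", "Feed the cats");
        PROTOCOLS.put("memory", store);
        MailAccessIF mail = PROTOCOLS.get("memory");
        System.out.println(mail.getMAList("barbara", "inbox", 10).length);
    }
}
```

Keeping every store behind the same four methods is what lets the Media Manager treat NinjaMail, POP, and IMAP uniformly.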

Page 21: Media Manager Mail Access Unified Messaging

Conclusion

• Overall
  – System useful as navigational hints
  – To achieve total comprehension, need better voice recognition

• What works well
  – Skimming using pause removal
  – Detecting spurts for structure

• What needs work
  – Speech detection in conversational settings
  – Pitch emphasis needs refining

• Future Directions
  – Implementing more mail stores
  – Enhancing interfaces
  – Pause detection/word boundaries using speech detection
  – Developing voicemail grammars
  – Using NLP feedback with pitch emphasis detection
  – Improved speech detection in noisy environments

Page 22: Media Manager Mail Access Unified Messaging
Page 23: Media Manager Mail Access Unified Messaging

MediaManagerServiceIF

• String[] getFolders( userName )
• byte[][] getFoldersAs( userName, returnType )
• MediaMessage[] getList( userName, folderName, count )
• byte[][] getListAs( userName, folderName, count, returnType )
• MediaMessage getMessage( MediaRef )
• ContentObject getMessageContent( ContentID, returnType )

Page 24: Media Manager Mail Access Unified Messaging

Pitch Detection

• The Idea
  – A speaker’s pitch naturally changes when introducing topics or emphasizing words [Hirshberg92]
  – Use pitch increases as hints for “important” words

• Algorithm [Arons95]
  – Determine pitch for each 20 ms frame (FFT with SHS)
  – Set emphasis threshold to be top 1% of pitch values (by histogram)
  – Mark 1 sec interval as emphasized if it contains >= 3 emphasized frames
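The last two steps of the algorithm can be sketched as follows, assuming the per-frame pitch values have already been computed elsewhere. Several details are assumptions rather than anything stated in the slides: unvoiced frames are encoded as 0 and ignored, the top 1% is found by sorting instead of by histogram, and the 1 second intervals are non-overlapping windows of 50 frames.

```java
import java.util.Arrays;

class PitchEmphasis {
    static final int FRAMES_PER_SECOND = 50;   // 20 ms frames
    static final int TOP_PERCENT = 1;          // emphasis threshold: top 1% of pitch values
    static final int MIN_EMPHASIZED_FRAMES = 3;

    // Returns the pitch at or above which roughly the top 1% of voiced frames lie.
    // Sorting replaces the histogram named on the slide; ties at the threshold
    // can push the emphasized share above 1%.
    static double emphasisThreshold(double[] pitchHz) {
        double[] voiced = Arrays.stream(pitchHz).filter(p -> p > 0).sorted().toArray();
        if (voiced.length == 0) return Double.POSITIVE_INFINITY;
        int topCount = Math.max(1, voiced.length * TOP_PERCENT / 100);
        return voiced[voiced.length - topCount];
    }

    // Marks each 1 second window that holds at least three emphasized frames.
    static boolean[] emphasizedSeconds(double[] pitchHz) {
        double threshold = emphasisThreshold(pitchHz);
        int seconds = (pitchHz.length + FRAMES_PER_SECOND - 1) / FRAMES_PER_SECOND;
        boolean[] emphasized = new boolean[seconds];
        for (int s = 0; s < seconds; s++) {
            int start = s * FRAMES_PER_SECOND;
            int end = Math.min(start + FRAMES_PER_SECOND, pitchHz.length);
            int hits = 0;
            for (int i = start; i < end; i++) {
                if (pitchHz[i] >= threshold) hits++;
            }
            emphasized[s] = hits >= MIN_EMPHASIZED_FRAMES;
        }
        return emphasized;
    }

    public static void main(String[] args) {
        double[] pitch = new double[500];          // 10 seconds of frames
        Arrays.fill(pitch, 100.0);
        for (int i = 120; i < 125; i++) pitch[i] = 200.0; // raised pitch in the third second
        System.out.println(Arrays.toString(emphasizedSeconds(pitch)));
    }
}
```

Requiring three emphasized frames per window filters out isolated pitch spikes, which the Results slide identifies as a source of problems.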

Page 25: Media Manager Mail Access Unified Messaging

Pause Detection

• Why is pause detection useful?
  – Removing pauses speeds up playback
    • Typically, 50-70% of original time [Foulke71]
  – Long pauses signify groups (talk spurts)

• Noise and soft sounds create difficulties

• Algorithm: Smoothed Histogram [Lamel81]
  – Calculate energy per 10 ms frame
  – Threshold based on smoothed histogram (5 dB after first peak)
  – Use heuristics to remove artifacts

[Histogram figure: x-axis Average energy (dB), y-axis Percent of Frames]
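The three algorithm steps can be sketched as below. This is an illustration under assumptions, not the detector from [Lamel81]: the 1 dB bins, the 1-2-1 smoothing kernel, the 0 to 100 dB range, and the single "ignore very short pauses" heuristic are all choices made for the example, and every name is hypothetical.

```java
class PauseDetector {
    static final int MAX_DB = 100; // histogram covers 0..100 dB in 1 dB bins (assumed)

    // Average energy of each 10 ms frame in dB; the +1 avoids log10(0) on silence.
    static double[] frameEnergiesDb(short[] samples, int sampleRateHz) {
        int frameLen = sampleRateHz / 100;
        int frames = samples.length / frameLen;
        double[] db = new double[frames];
        for (int f = 0; f < frames; f++) {
            double sum = 0;
            for (int i = 0; i < frameLen; i++) {
                double s = samples[f * frameLen + i];
                sum += s * s;
            }
            db[f] = 10 * Math.log10(sum / frameLen + 1.0);
        }
        return db;
    }

    // Threshold = first peak of the smoothed energy histogram plus 5 dB.
    static double pauseThresholdDb(double[] energyDb) {
        double[] hist = new double[MAX_DB + 1];
        for (double e : energyDb) {
            int bin = (int) Math.max(0, Math.min(MAX_DB, Math.round(e)));
            hist[bin]++;
        }
        double[] smooth = new double[hist.length];
        for (int b = 0; b < hist.length; b++) {
            double left = b > 0 ? hist[b - 1] : 0;
            double right = b < hist.length - 1 ? hist[b + 1] : 0;
            smooth[b] = (left + 2 * hist[b] + right) / 4; // 1-2-1 smoothing kernel
        }
        for (int b = 0; b < smooth.length; b++) {
            double left = b > 0 ? smooth[b - 1] : 0;
            double right = b < smooth.length - 1 ? smooth[b + 1] : 0;
            if (smooth[b] > 0 && smooth[b] >= left && smooth[b] > right) return b + 5.0;
        }
        return Double.NEGATIVE_INFINITY; // no frames, so nothing counts as a pause
    }

    // Marks frames below the threshold, ignoring quiet runs shorter than minPauseFrames.
    static boolean[] markPauses(double[] energyDb, int minPauseFrames) {
        double threshold = pauseThresholdDb(energyDb);
        boolean[] pause = new boolean[energyDb.length];
        int i = 0;
        while (i < energyDb.length) {
            if (energyDb[i] >= threshold) { i++; continue; }
            int start = i;
            while (i < energyDb.length && energyDb[i] < threshold) i++;
            if (i - start >= minPauseFrames) {
                for (int k = start; k < i; k++) pause[k] = true;
            }
        }
        return pause;
    }

    public static void main(String[] args) {
        double[] energy = new double[100];
        for (int i = 0; i < energy.length; i++) energy[i] = i < 30 ? 20.0 : 60.0;
        System.out.println("threshold dB: " + pauseThresholdDb(energy));
    }
}
```

Skimmed audio then amounts to playing back only the frames that are not marked as pauses, while the longer pauses mark the boundaries between talk spurts.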

Page 28: Media Manager Mail Access Unified Messaging

Works Cited

• [Arons95] B. Arons. Interactively Skimming Recorded Speech. Ph.D. dissertation, MIT, 1985.

• [Foulke71] E. Foulke. The Perception of Time Compressed Speech. Ch. 4 in Perception of Language, edited by P.M. Kjeldergaard, D.L. Horton, and J.J. Jenkins. Charles E. Merrill Publishing Company, 1971. pp. 79-107.

• [Hirshberg92] J. Hirschberg and B. Grosz. Intonational Features of Local and Global Discourse. In Proceedings of the Speech and Natural Language Workshop (Harriman, NY, Feb. 23-26). Morgan Kaufmann Publishers, 1992. pp. 441-446.

• [Lamel81] L.F. Lamel, L.R. Rabiner, A.E. Rosenberg, and J.G. Wilpon. An Improved Endpoint Detector for Isolated Word Recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-29, 4 (Aug. 1981), 771-785.
