VOICE INFORMATION RETRIEVAL FOR DOCUMENTS

Except where reference is made to the work of others, the work described in this thesis is

my own or was done in collaboration with my advisory committee.

__________________ Weihong Hu

Certificate of Approval:

_______________________________
W. Homer Carlisle
Associate Professor
Computer Science and Software Engineering

______________________________
Juan E. Gilbert, Chair
Assistant Professor
Computer Science and Software Engineering

_______________________________
N. Hari Narayanan
Associate Professor
Computer Science and Software Engineering

______________________________
Stephen L. McFarland
Acting Dean
Graduate School

Weihong Hu

A Thesis

Submitted to

The Graduate Faculty of

Auburn University

In Partial Fulfillment of the

Requirements for the

Degree of

Master of Science

Auburn, Alabama

August 4, 2003

Weihong Hu

Permission is granted to Auburn University to make copies of this thesis at its discretion,

upon the request of individuals or institutions and at their expense. The author reserves

all publication rights.

__________________

Signature of Author

__________________

Copy sent to:

Name Date

THESIS ABSTRACT

Weihong Hu

Master of Science, August 4, 2003

68 Typed Pages

Directed by Dr. Juan E. Gilbert

Currently, new methods of interaction between people and the World Wide Web

are constantly emerging. Among them, voice is becoming more and more preferred.

Various voice applications (telephone-enabled applications) have been implemented and

used by governments, businesses, universities, libraries, visually impaired people, etc.

However, very little attention has been given to document information retrieval using

voice because of existing technical difficulties and limitations with natural language

processing, voice recognition, grammar generation, result representation, etc.

This thesis explores the background of information retrieval using voice,

especially Interactive Voice Response (IVR) systems, surveys several well-known existing

projects, and introduces the concepts of Voice Extensible Markup Language

(VoiceXML) [15]. A voice information retrieval system for documents (VIRD) has been

designed and implemented to search for documents from a database using the telephone

and VoiceXML. Five phases have been applied to this research: database creation and

normalization, user inquiries, denormalized view and stored procedures, summarization

functions, and user interface design.

In this research, an experiment has been conducted to measure the effectiveness

and the usability of VIRD. The PARADISE framework [17] was used to evaluate the

effectiveness of VIRD. Both quantitative and qualitative data were collected. Two

sets of metrics were applied and analyzed. A careful analysis of the experimental data

showed that VIRD was effective and satisfying to users as a mode of

document information retrieval via mobile access. However, it was also found that

improved recognition and improved representation for large result sets were required.

Finally, conclusions of this research are presented and future work that aims to improve

VIRD is suggested.

ACKNOWLEDGMENTS

The author would like to express her deep gratitude to her advisor, Dr. Juan E.

Gilbert, for his patient guidance, valuable advice, and continued encouragement

throughout her studies. Sincere thanks are also due to her two graduate committee

members, Dr. N. Hari Narayanan and Dr. W. Homer Carlisle, for their reviewing and

advising efforts. In addition, the author would like to thank her husband, Yapin Zhong,

for his help in conducting the experiment and for his constant support.

Voice Information Retrieval for Documents

Weihong Hu

M.S. Thesis

Dept. of Computer Science & Software Engineering
Auburn University

Outline

Motivation
Literature review
VIRD System Architecture & Voice User Interface (VUI)
Experiment
Future Work
Demo

Motivation

A very large part of the world population does not have access to either computers or the Internet
Very tiny visual interfaces make users feel quite uncomfortable
Blind or partially-sighted users are not able to access information visually
VoiceXML technologies provide an alternative way to search for documents via mobile devices
Very little work involving VUI for document retrieval

Literature Review

Information Retrieval via Voice
VoiceXML Technology
Common VoiceXML applications

Information Retrieval via Voice

Traditional Interactive Voice Response (IVR) systems
– IVR systems are software applications that accept telephone input and touch-tone keypad selection and provide appropriate responses

VoiceXML applications
– Allow users to call into an application system and use a combination of their voice and/or telephone input and/or touch-tone keypad to interact with the system
– Use the HTTP protocol to interact with the Web server
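As a rough illustration of the last point, the sketch below builds the kind of minimal VoiceXML 2.0 document a Web server could return over HTTP to a voice browser. It is not taken from VIRD; the function name `build_vxml_prompt` and the prompt text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C VoiceXML 2.0 specification.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_vxml_prompt(text: str) -> str:
    """Return a minimal VoiceXML 2.0 document that speaks `text`.

    Hypothetical helper for illustration only; a real application would
    add grammars, fields, and error handlers.
    """
    root = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(root, "form")
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    prompt.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A web handler would send this string as the HTTP response body.
    print(build_vxml_prompt("Welcome to the document search system."))
```

The voice browser on the telephony side fetches such a page much as a visual browser fetches HTML, then renders it as speech.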

VoiceXML

Voice Extensible Markup Language (VoiceXML)
A World Wide Web Consortium (W3C) standard speech-application development language
Designed for creating audio dialogues that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations
Allows users to interact with the Internet without needing visual access
Allows users to have complete control over the user-application interaction through spoken dialogues

Voice Portals & System Infrastructure

Voice Portals & VoiceXML System Infrastructure (cont’d)

Voice portals provide the running platforms for voice applications
Some well-known voice portals
– Tellme, VocalTec, and BeVocal

Common VoiceXML applications

Simple uses
– Movie listings
– Traffic information
– Order tracking
– Directory assistance
– Personal information management

Complex uses
– Business communications, virtual offices, and voice email
– Web-based IVR speech-recognition enabled call centers
– E-commerce
– Airline reservations
– Stock trades and financial services management

VIRD System Architecture

[Architecture diagram; component labels: Speech, Speech Interface, Voice Server, VoiceXML Interpreter, Controller, IR System]

VIRD Document Database

Twenty document abstracts
http://www.citeseer.com database

Sample Document Abstract

Title: Selectivity Estimation for Boolean Queries

Abstract: In a variety of applications ranging from optimizing queries on alphanumeric attributes to providing approximate counts of documents containing several query terms, there is an increasing need to quickly and reliably estimate the number of strings (tuples, documents, etc.) matching a Boolean query. Boolean queries in this context consist of substring predicates composed using Boolean operators. While there has been some work in estimating the selectivity of substring queries,

Sample VoiceXML grammar

<grammar><![CDATA[
[
  [(query)] {<keyword "query">}
  [(match)] {<keyword "match">}
  [(boolean)] {<keyword "boolean">}
  [(estimation)] {<keyword "estimation">}
  [(selectivity)] {<keyword "selectivity">}
  [(optimize)] {<keyword "optimize">}
  [(tuple)] {<keyword "tuple">}
  [(operator)] {<keyword "operator">}
  [(application)] {<keyword "application">}
  [(substring)] {<keyword "substring">}
  [(alphanumeric)] {<keyword "alphanumeric">}
  [(attribute)] {<keyword "attribute">}
  [(approximate)] {<keyword "approximate">}
]
]]></grammar>
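A grammar of this shape is repetitive enough that it could plausibly be generated from a keyword list rather than typed by hand. The sketch below shows one way to do that; it is an assumption, not the generation method used in VIRD, and the function name `build_keyword_grammar` and the hand-supplied keyword list are hypothetical.

```python
def build_keyword_grammar(keywords):
    """Render keywords in the same inline grammar layout as the sample.

    Hypothetical helper: each keyword becomes one alternative that fills
    the `keyword` slot with the recognized word.
    """
    lines = ["<grammar><![CDATA[", "["]
    for word in keywords:
        # Doubled braces emit literal { and } inside an f-string.
        lines.append(f'  [({word})] {{<keyword "{word}">}}')
    lines.append("]")
    lines.append("]]></grammar>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Keywords chosen by hand from the sample abstract (illustrative only).
    print(build_keyword_grammar(["query", "match", "boolean"]))
```

How the keywords themselves are selected from each abstract is outside the scope of this sketch.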

PRINCIPLES OF VIRD VUI DESIGN

Continuous Representation
– making the system’s capabilities apparent to the user as a reminder at any point in the dialogues

Immediate Impact
– immediate, implicit confirmation must be provided

Incrementality
– a sense of continuity and natural flow in the conversation between the system agent and the user

Summarization and Aggregation
– the results must be condensed for audio-only interfaces due to the constraint imposed by auditory memory limitations

Diagram of VIRD VUI

Welcome Message

Main Menu Dialogue

Query Dialogue

Results Dialogue

Save Dialogue

Confirm MyLibrary

Confirm Email

VIRD Voice User Interface

Main Menu Dialogue
– Contains four search functions: keyword, title, author, or year

Query Dialogue
– Allows the user to say the words that will be used during the search

VIRD Voice User Interface (cont’d)

Results Dialogue
– Voice Navigator: presents the list of retrieved documents to the user through a list of voice commands: NEXT, PREVIOUS, STOP, REPEAT, DETAIL, TRY AGAIN, or SAVE

Save Dialogue
– Allows the user to request a copy of the article via email or library
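The Voice Navigator commands can be read as a small state machine over the result list. The sketch below is one possible interpretation, not the VIRD implementation: the class names, the spoken prompts, the behaviour at the ends of the list, and the dialogue labels ("results", "query", "save", "exit") are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    abstract: str


class VoiceNavigator:
    """Hypothetical navigator that maps a recognized command to the next
    dialogue and the text to speak."""

    def __init__(self, results):
        self.results = list(results)
        if not self.results:
            raise ValueError("the navigator needs at least one result")
        self.index = 0

    def handle(self, command):
        current = self.results[self.index]
        command = command.strip().upper()
        if command == "NEXT":
            if self.index < len(self.results) - 1:
                self.index += 1
                return ("results", self.results[self.index].title)
            return ("results", "That was the last document.")
        if command == "PREVIOUS":
            if self.index > 0:
                self.index -= 1
                return ("results", self.results[self.index].title)
            return ("results", "That was the first document.")
        if command == "REPEAT":
            return ("results", current.title)
        if command == "DETAIL":
            return ("results", current.abstract)
        if command == "TRY AGAIN":
            return ("query", "Please say your search words.")
        if command == "SAVE":
            return ("save", f"Saving {current.title}.")
        if command == "STOP":
            return ("exit", "Goodbye.")
        # Unrecognized input keeps the caller in the Results Dialogue.
        return ("results", "Sorry, I did not understand.")
```

Returning the next dialogue alongside the prompt keeps the transitions to the Query and Save dialogues explicit, which mirrors the dialogue diagram shown earlier.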

Experiment

Participants
– Twenty Computer Science graduate and senior undergraduate students in a User Interface Design course at Auburn University participated in this experiment (ten female, ten male)

Procedures
– Came in, used the same telephone, and sat in the same chair in the same room with the experimenter (as an observer)
– Read a one-page instruction sheet
– Interacted with the VIRD system to complete a task based on the task scenario
– Task scenario: “You are working on a research paper for Dr. X’s database course. Your research topic is XML. Dr. X wants you to find a document on the subject tree algebra for XML using the system. When you find the document, use the save option to let the system email it to you.”
– Filled out a survey giving a subjective evaluation of the system’s performance

Evaluation Methodology

Measuring user satisfaction with the voice user interface for the document retrieval system
PARADISE framework [1]

Evaluation Methodology (cont’d)

Maximize user satisfaction

Maximize task success

Minimize costs

Efficiency measures

Qualitative measures
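In PARADISE as it is usually formulated, these objectives are combined into a single performance score: a weighted, normalized task-success term minus a weighted sum of normalized cost measures. The sketch below computes that score per dialogue. The weights and sample values are placeholders, not figures from this experiment; in PARADISE the weights are normally estimated by regressing user satisfaction on the normalized measures.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Normalize values to zero mean and unit (population) deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def paradise_performance(task_success, costs, alpha, weights):
    """Return one performance score per dialogue.

    task_success: task-success measure for each dialogue
    costs: dict mapping a cost name to its per-dialogue values
    alpha, weights: hypothetical weights supplied by the caller
    """
    n_success = z_scores(task_success)
    n_costs = {name: z_scores(vals) for name, vals in costs.items()}
    scores = []
    for i in range(len(task_success)):
        penalty = sum(weights[name] * n_costs[name][i] for name in costs)
        scores.append(alpha * n_success[i] - penalty)
    return scores


if __name__ == "__main__":
    # Made-up values for two dialogues, for illustration only.
    print(paradise_performance([1.0, 0.0], {"time": [10.0, 20.0]},
                               alpha=1.0, weights={"time": 0.5}))
```

Normalizing each measure first lets costs on different scales, such as elapsed time and error counts, be weighted against one another.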

Evaluation Metrics

The first set:
– Task success
– Dialogue efficiency
– Dialogue qualitative

The second set:
– Completion
– Inaccuracy
– Misinterpretation

Evaluation Result: Metrics Comparison

Metrics Comparison Chart

Metric                 Value
Task Success           86.50%
Dialogue Efficiency    89.50%
Dialogue Qualitative   81%
User Satisfaction      85%

Evaluation Result (cont’d): Time of Completion

[Chart: time of completion per subject; x-axis: Subjects, ticks 1 to 18]

Evaluation Result (cont’d): Misinterpretation

[Chart: misinterpretation per subject; x-axis: Subjects, ticks 1 to 17]

Experiment Summary

Even though the misinterpretation rate is high, user satisfaction is still high; this means that users will accept errors as long as they can recover from them easily
A potential flaw in PARADISE

Maximize user satisfaction

Maximize task success

Minimize costs

Efficiency measures

Qualitative measures

Future Work

Investigate Spoken Query Retrieval for Large Documents (Yapin’s research)
Investigate a new usability model for Voice User Interface (Priyanka’s research)

Demo

Questions

References

1. C. A. Kamm & M. A. Walker. Design and evaluation of spoken dialog systems. In Proceedings of the ASRU Workshop, 1997.
