E healthresearchpresentation


Transcript of E healthresearchpresentation

Page 1: E healthresearchpresentation

Speech Matrix® Systems

Automated Speech Recognition for Self-reports in Health Research

Alex Levin, Spacegate, Inc.
[email protected]

Esther Levin, Spacegate, Inc., City College of New York
[email protected]

Page 2: E healthresearchpresentation

Confidential – Spacegate, Inc.

Acknowledgement

Speech Matrix Research & Development is supported by NIH, NCI SBIR Phase I program

Page 3: E healthresearchpresentation

Background: Health Data Collection

- Traditional methods
  - Paper-based diaries
  - Observations
- Modern methods: Electronic Data Capture (EDC)
  - PDA
  - Web based
  - IVR (touch-tone)
- Next step in EDC
  - ASR (automated speech recognition)

Page 4: E healthresearchpresentation

Electronic Data Capture: Goals & Challenges

- Real data in real time
- Data validation
- Regulatory compliance
- Increase clinical trial efficiency
- Reduce time-to-market
- Cost
- Patient/subject burden

Page 5: E healthresearchpresentation

Automated Speech Technology: Science & Art

Voice to Data: voice data entry via automated natural dialogue

- Automated Dialogue: Voice Forms
- Voice Interface Design
- Real-time Monitoring & Reporting
- Privacy & Security
- Regulatory Compliance

Page 6: E healthresearchpresentation

ASR System for Data Collection

[Architecture diagram. Component labels: Phone, PSTN, Data, Web client, Reporting Server, Speech Media Server, Telephony Interface, VXML Interpreter, ASR Engine, TTS, Prompts/Audio, DTMF, Application Server, Application Logic, Web server, ASR Grammars, Reporting tools, Monitoring tools]

Page 7: E healthresearchpresentation

Why Speech?

- Speech is a natural modality of interaction
- The phone is user friendly and ubiquitous, and no special training is required for its use
- Dynamic dialogue flow
  - Personalization of both content and style based on the profile and history
- Real-time feedback and monitoring
  - Real-time reports of captured data
  - Automated compliance monitoring
- Flexible and extensive scheduling
  - Inbound/outbound call sessions
  - Calls can be initiated by the system following a prescribed protocol

Overall, an ASR-based system offers an extensive and practical tool to facilitate efficient and convenient real-time, two-way communications.

Page 8: E healthresearchpresentation

Speech Matrix®-VDC™ Systems

- Extensive data collection system for health, clinical, life science and behavioral research
- Another branch of EDC
- Real-time data capture in a participant's native environment

Page 9: E healthresearchpresentation

Applications

- Pain Monitoring Diary (PMD™): prototype application based on questions in standard pain questionnaires (BPI)
- Drug Use Diary (VDMD™): prototype application based on questions in a standard drug use diary (Dr. Linda Sobell)
- Implements dynamic questionnaires
- Interactivity & communications
- Reporting and management

Page 10: E healthresearchpresentation

Dialog Design

Task characteristics:
- Need to guarantee data validity, accuracy and integrity, taking into account speech recognition errors
  - Improve the overall accuracy using dialog actions such as re-prompts, confirmations, error handling and, if necessary, recording and flagging the unrecognized utterances for later transcription
- The system should accommodate both novice and experienced callers
  - Enough information and help to guarantee question understanding and successful session completion for novices
  - Short and effective call flow for experienced callers
- Subjects identify themselves at the beginning of each session
  - Opportunity to use the knowledge accumulated across sessions for personalization
- Subjects may receive some training on the use of the spoken dialog system during the enrollment session

Dialog design issues:
- Controlling the accuracy of the captured data
- Adaptive level of user support

Page 11: E healthresearchpresentation

Controlling the Accuracy of Data Capture: ASR 101

- Speech recognition grammar design
  - Example: yes/no grammar {yes, no}
- The caller's utterance is matched against the possibilities described by the grammar
- The output of ASR is the best-matching ‘sentence’ and a score
- If the score is too low => rejection
- Out-of-vocabulary utterances cannot be recognized correctly
- Design tradeoff: minimize out-of-vocabulary utterances and minimize grammar size
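
The matching-and-rejection behaviour on this slide can be illustrated with a toy sketch. It assumes text input and uses string similarity from Python's `difflib` as a stand-in for the acoustic score a real ASR engine would produce; the function name `recognize` and the 0.6 rejection threshold are illustrative assumptions, not part of the system described here.

```python
from difflib import SequenceMatcher


def recognize(utterance, grammar, reject_below=0.6):
    """Return (best_sentence, score); best_sentence is None on rejection.

    Toy stand-in: similarity between text strings replaces the acoustic
    score that a real ASR engine would compute (an assumption).
    """
    text = utterance.strip().lower()
    # Score every sentence the grammar allows and keep the best match.
    score, best = max((SequenceMatcher(None, text, s).ratio(), s) for s in grammar)
    if score < reject_below:
        return None, score  # score too low => rejection
    return best, score


YES_NO = ["yes", "no"]
print(recognize("yes", YES_NO))              # in-grammar answer is accepted
print(recognize("no, left elbow", YES_NO))   # out-of-vocabulary answer is rejected
print(recognize("yeah", YES_NO))             # rejected: not covered by the grammar
print(recognize("yeah", YES_NO + ["yeah"]))  # accepted once the grammar is enlarged
```

The last two calls hint at the tradeoff: a larger grammar covers more answers, but in a real recognizer it also adds alternatives that can be confused with each other.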

Page 12: E healthresearchpresentation

Controlling the Accuracy of Data Capture

- Improved rejection mechanisms to deal with out-of-vocabulary utterances

    System prompt: Was that your left shoulder?
    User: no, left elbow
    System prompt: I didn’t get that. Was that your left shoulder? Please say ‘yes’ or ‘no’.

- Reliable confirmation recognition
- Using confirmations as the way to control the larger grammar’s accuracy
- Using recording to capture the out-of-grammar answers and problematic user inputs

    System prompt: “Was that your left shoulder?”
    User: “No”
    System prompt: “Sorry about that. Let’s try it this way. Please choose carefully a body part from the following list that best describes the location of your pain, and just say it. If none of the locations match, please say ‘none of those’. Here is the list: abdomen <pause>, ankles …”
    User (barges in): “none of those”
    System prompt: “Ok. Let me just record your answer. Please describe the location of your pain in your own words.”
    User: <……>
    System prompt (after recording is finished): “Thanks, I got that. Let’s move on.”

  - The recorded utterance is captured and flagged as “transcription is needed” for later processing
  - The same mechanism of fall-back to recording instead of recognition is used after several repeated recognition failures
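
The confirm-then-fall-back-to-recording flow above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `capture_answer`, the `Captured` record, the three injected callables and the limit of three failures are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Captured:
    value: Optional[str] = None        # recognized and confirmed answer
    recording: Optional[str] = None    # id of stored audio, if any (hypothetical)
    needs_transcription: bool = False  # flag for later manual processing


def capture_answer(
    ask: Callable[[str], Optional[str]],  # prompt -> recognized answer, None if rejected
    confirm: Callable[[str], bool],       # confirmation prompt -> True if caller says yes
    record: Callable[[str], str],         # prompt -> id of the recorded audio
    question: str,
    max_failures: int = 3,                # assumed limit, not taken from the slides
) -> Captured:
    failures = 0
    while failures < max_failures:
        answer = ask(question)
        if answer == "none of those":
            break  # caller opted out of the offered list
        if answer is not None and confirm(f"Was that your {answer}?"):
            return Captured(value=answer)
        failures += 1  # a rejection, or a "no" to the confirmation
    # Fall back to recording instead of recognition.
    audio_id = record("Ok. Let me just record your answer.")
    return Captured(recording=audio_id, needs_transcription=True)


# Scripted caller reproducing the slide: misrecognition, then "none of those".
answers = iter(["left shoulder", "none of those"])
result = capture_answer(
    ask=lambda prompt: next(answers),
    confirm=lambda prompt: False,
    record=lambda prompt: "rec-001",
    question="Where does it hurt?",
)
print(result)
```

Passing the recognizer, confirmation step and recorder in as callables keeps the control flow testable without a telephony stack.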

Page 13: E healthresearchpresentation

Adaptive Level of User Support - I

- Prompt design

    “Where does it hurt? <pause> For example, your head, stomach or back? <pause> Remember, if you don’t know how to answer this question, just say ‘I need help’.”

- Context-sensitive help
  - Help information describes and clarifies the current question
  - Provides examples of possible answers
  - Example: help for the “where does it hurt” question:

    “Okay. Here is the help information. At this point I need to find out the part of your body that hurts the most. Please choose carefully a body part from the following list that best describes the location of your pain, and just say it. If none of them matches, please say ‘none of those’. Here is the list: abdomen <pause>, ankles <pause>, back <pause>, ... (list continues) …, toes <pause>. Which one is it?”

Page 14: E healthresearchpresentation

Adaptive Level of User Support - II

- Detecting speech recognition failures
  - The re-prompts are designed as an escalating list, providing increasingly more information and progressively constraining the caller:

    “Where does it hurt? <pause> For example, your head, stomach or back? <pause> Remember, if you don’t know how to answer this question, just say ‘I need help’.”
    “I didn’t get that. Please tell me the part of your body that hurts the most. Remember, you could always say ‘I need help’.”

- Detecting misunderstandings
  - The user says “no” to a confirmation question, as in:

    System prompt: Was that your left shoulder?
    User: No.
    System prompt: Sorry about that. Let’s try it this way. Please choose carefully a body part from the following list that best describes the location of your pain, and just say it. If none of them matches, please say ‘none of those’. Here is the list: abdomen <pause>, … (list continues). Which one is it?
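
One simple way to express an escalating re-prompt list is to index it by the number of failures so far. The sketch below is only an assumption about how such a list might be coded; `REPROMPTS` abbreviates the prompts quoted on the slides and `prompt_for_attempt` is a hypothetical helper.

```python
# Ordered from most open to most constraining (wording abbreviated from the slides).
REPROMPTS = [
    "Where does it hurt? For example, your head, stomach or back?",
    "I didn't get that. Please tell me the part of your body that hurts the most.",
    "Please choose a body part from the following list, or say 'none of those'.",
]


def prompt_for_attempt(failures, prompts=REPROMPTS):
    """Pick the prompt for the current attempt; stay on the last one once exhausted."""
    index = min(max(failures, 0), len(prompts) - 1)
    return prompts[index]


for failures in range(4):
    print(failures, prompt_for_attempt(failures))
```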

Page 15: E healthresearchpresentation

Adaptive Level of User Support - III

[Flowchart "Normal". Node labels: Pin, Are you in pain? (yes/no branches), Pain Location, Pain Intensity, Pain Type, Pain Always There?, Symptoms, message, end]

[Flowchart "Follow-up". Node labels: Pin, Are you in pain? (yes/no branches), Pain Location, Pain Intensity, Pain Type, Pain Always There?, Symptoms, message, end, Medication taken? (yes/no branches), Medication time]

Dialog personalization: taking advantage of a larger inter-session context by designing two types of data collection sessions, normal and follow-up. Both content and style are different. Reminding prompts: “Is the pain still in your left shoulder?”
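
The reminding prompts can be thought of as a choice between an open question and a yes/no question built from the previous session's answer. The following sketch assumes a per-subject dictionary of earlier answers; the field names and the `question_for` helper are illustrative, not taken from the system.

```python
from typing import Optional

# Open questions used in a normal session (wording from the slides).
OPEN_QUESTIONS = {
    "pain_location": "Where does it hurt?",
    "pain_type": "How does the pain feel?",
}

# Reminding prompts used in a follow-up session when a previous answer exists.
REMINDERS = {
    "pain_location": "Is the pain still in your {value}?",
    "pain_type": "Does the pain still feel {value}?",
}


def question_for(field: str, previous: Optional[dict]) -> str:
    """Return a reminding prompt if the last session captured this field."""
    last_value = (previous or {}).get(field)
    if last_value is None:
        return OPEN_QUESTIONS[field]
    return REMINDERS[field].format(value=last_value)


last_session = {"pain_location": "left shoulder", "pain_type": "sharp"}
print(question_for("pain_location", None))          # normal session
print(question_for("pain_location", last_session))  # follow-up session
print(question_for("pain_type", last_session))
```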

Page 16: E healthresearchpresentation

NORMAL SESSION

System: Your pin please.
Caller: ****
System: Ok, *****, let's begin. Are you in pain now?
Caller: yes
System: Where does it hurt? <pause> For example, your head, stom...
Caller: my left shoulder
System: Was that your left shoulder?
Caller: yes
System: Please rate your pain on a scale of zero to ten, with zero being no pain and ten being the worst possible pain.
Caller: it is seven
System: Was that 7? <pause> Please say yes or no.
Caller: yeah
System: How does the pain feel?
Caller: I don't know
System: Okay. Here is the help information. At this point I need to find out the type of your pain. Please choose carefully a word from the following list that best describes your pain and just say it. If nothing matches, please say ‘none of those’. Here is the list: aching, <pause> burning, <pause> dull…
Caller: it feels sharp
System: Was that ‘sharp’?
Caller: yeah
System: Is the pain constant or does it come and go?
Caller: comes and goes
System: Do you have any other symptoms?
Caller: yeah
System: Please tell me your most disturbing symptom. For...
Caller: I have fever
System: I heard ‘fever’. Is that right?
Caller: yes
System: Would you like to leave a message for the staff? …
Caller: no
System: Thanks for your cooperation. I've got all the information that I need. I'll call you again in 4 hours. Goodbye.

Page 17: E healthresearchpresentation

FOLLOW-UP SESSION

System: Your pin please.
Caller: ****
System: Hi, Esther. This is your follow-up call … Are you still having pain?
Caller: yeh
System: Did you take your pain medication today?
Caller: yes
System: What time did you take the medication?
Caller: Two hours ago
System: Was that around 9:45 am?
Caller: yeh
System: Did the medicine diminish your pain?... Please
Caller: yeh
System: Please rate your pain on a scale of zero to ten, with zero being no pain and ten being the worst possible pain.
Caller: four
System: Was that 4?
Caller: yeah
System: Is the pain still in your left shoulder?
Caller: yes
System: Does the pain still feel sharp?
Caller: no
System: How does the pain feel?
Caller: It’s dull
System: Was that ‘dull’?
Caller: yeah
System: Is the pain constant or does it come and go?
Caller: It’s constant
System: Do you have any other symptoms?
Caller: no
System: Would you like to leave a message for the staff? …
Caller: no
System: Thanks for your cooperation. I've got all the information that I need. I'll call you again tomorrow. Goodbye.
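
In the follow-up dialog the caller answers "Two hours ago" and the system confirms an absolute time. A rough sketch of that conversion is shown below; it assumes the call took place at 11:45 am (a value chosen only so the result matches the slide), and the helper names, the small number-word table and the supported phrase pattern are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

NUMBER_WORDS = {
    "a": 1, "an": 1, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "ten": 10, "eleven": 11, "twelve": 12,
}


def resolve_time(utterance: str, call_time: datetime) -> Optional[datetime]:
    """Turn '<number word> hours/minutes ago' into an absolute time, else None."""
    match = re.fullmatch(r"(\w+) (hour|minute)s? ago", utterance.strip().lower())
    if match is None:
        return None
    word, unit = match.groups()
    count = NUMBER_WORDS.get(word)
    if count is None:
        return None
    delta = timedelta(hours=count) if unit == "hour" else timedelta(minutes=count)
    return call_time - delta


def confirmation(t: datetime) -> str:
    """Build the confirmation prompt in 12-hour format."""
    hour12 = t.hour % 12 or 12
    suffix = "am" if t.hour < 12 else "pm"
    return f"Was that around {hour12}:{t.minute:02d} {suffix}?"


call_time = datetime(2024, 1, 1, 11, 45)  # assumed call time; the date is arbitrary
taken_at = resolve_time("Two hours ago", call_time)
print(confirmation(taken_at))
```

Confirming the resolved time, as the dialog does, gives the caller a chance to correct a misrecognized number.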

Page 18: E healthresearchpresentation

Normal session report

                    Captured Value        Confirmed (yes/no)   Confidence Score
Pin                 ****                  no                   66
Are you in pain?    yes                   no                   80
Pain Location       left shoulder         yes                  86
Pain Intensity      7                     yes                  88
Pain Type           sharp                 yes                  88
Pain constant?      pain comes and goes   no                   47
Symptoms            fever                 yes                  86
Message             none                  no                   78

Page 19: E healthresearchpresentation

Follow-up session report

                     Captured Value   Confirmed   Confidence Score
Pin                  *****            no          74
In pain?             yes              no          85
Medication taken?    yes              no          76
Medication time      9:45 am          yes         69
Medication helped?   yes              no          75
Pain Rating          4                yes         87
Pain Location        left shoulder    yes         87
Pain Type            dull             yes         86
Pain constant?       constant         no          54
Symptoms             none             yes         82
Message              none             no          84
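
A report like the one above lends itself to simple automated screening. The sketch below holds the follow-up rows in a small record type and lists fields that were neither confirmed by the caller nor recognized with high confidence. The `ReportRow` type, the `needs_review` rule and the threshold of 60 are assumptions made for illustration; the slides do not say how such values are reviewed.

```python
from dataclasses import dataclass


@dataclass
class ReportRow:
    field: str
    value: str
    confirmed: bool
    confidence: int


# Rows copied from the follow-up session report.
FOLLOW_UP = [
    ReportRow("Pin", "*****", False, 74),
    ReportRow("In pain?", "yes", False, 85),
    ReportRow("Medication taken?", "yes", False, 76),
    ReportRow("Medication time", "9:45 am", True, 69),
    ReportRow("Medication helped?", "yes", False, 75),
    ReportRow("Pain Rating", "4", True, 87),
    ReportRow("Pain Location", "left shoulder", True, 87),
    ReportRow("Pain Type", "dull", True, 86),
    ReportRow("Pain constant?", "constant", False, 54),
    ReportRow("Symptoms", "none", True, 82),
    ReportRow("Message", "none", False, 84),
]


def needs_review(row: ReportRow, min_confidence: int = 60) -> bool:
    """Hypothetical rule: unconfirmed values with a low score deserve a second look."""
    return not row.confirmed and row.confidence < min_confidence


flagged = [row.field for row in FOLLOW_UP if needs_review(row)]
print(flagged)  # ['Pain constant?']
```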

Page 20: E healthresearchpresentation

Usability Test

- 24 subjects
- 118 dialog sessions
  - 113 completed
  - 5 hang-ups
  - 42 follow-up
- 1766 dialog turns
- 98% automatic data capture; the rest flagged for transcription
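
A few derived figures follow directly from the counts on this slide. The short calculation below only re-expresses those counts as rates; the variable names are illustrative.

```python
subjects = 24
sessions = 118
completed = 113
hang_ups = 5
dialog_turns = 1766

# Completed sessions and hang-ups account for every session.
assert completed + hang_ups == sessions

completion_rate = completed / sessions      # share of sessions completed
sessions_per_subject = sessions / subjects  # average sessions per subject
turns_per_session = dialog_turns / sessions # average dialog turns per session

print(f"completion rate:      {completion_rate:.1%}")      # 95.8%
print(f"sessions per subject: {sessions_per_subject:.1f}")  # 4.9
print(f"turns per session:    {turns_per_session:.1f}")     # 15.0
```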

Page 21: E healthresearchpresentation

Results

Session duration (sec)                                            105.6 (46.78)
Number of dialog units per session                                7.85 (2.6)
Duration of dialog unit (sec)                                     13.46 (4.54)
Dialog turns per dialog unit                                      1.88 (0.46)
Percentage of task-oriented turns                                 80% (16)
Percentage of barged-in prompts                                   66% (13)
Time duration of a dialog turn (sec)                              7.19 (1.10)
Time duration of a dialog turn when barge-in was disabled (sec)   10.63 (1.5)
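
The reported figures can be cross-checked against each other with simple arithmetic. The sketch below assumes the first number in each row is an average; multiplying averages is only a rough consistency check, since the mean of a product equals the product of means only when the quantities are uncorrelated.

```python
units_per_session = 7.85           # dialog units per session
unit_duration = 13.46              # seconds per dialog unit
turns_per_unit = 1.88              # dialog turns per dialog unit
turn_duration = 7.19               # seconds per dialog turn
turn_duration_no_barge_in = 10.63  # seconds per turn with barge-in disabled

# Units per session times unit duration should land near the 105.6 s session duration.
estimated_session = units_per_session * unit_duration

# Turns per unit times turn duration should land near the 13.46 s unit duration.
estimated_unit = turns_per_unit * turn_duration

# Time saved per turn when callers are allowed to barge in.
saving = turn_duration_no_barge_in - turn_duration
saving_fraction = saving / turn_duration_no_barge_in

print(f"estimated session duration: {estimated_session:.1f} s")
print(f"estimated unit duration:    {estimated_unit:.1f} s")
print(f"barge-in saving per turn:   {saving:.2f} s ({saving_fraction:.0%})")
```

Both estimates fall within a fraction of a second of the reported values, and barge-in shortens an average turn by roughly a third.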

Page 22: E healthresearchpresentation

Summary

- ASR & spoken dialog methodology for data capture can provide:
  - An additional real-time data collection tool
  - Flexible protocol design
  - Improved data validation and compliance
  - Centralized collection and monitoring
  - The telephone as a ubiquitous device
- System design needs to take into account the specificities of the task and the limitations of the technology
  - Flexible level of user support
  - Controlled accuracy of the captured data