WH2014 Session: Vocal-Diary: A Voice Command Based Ground Truth Collection System for Activity Recognition

WLSA CONVERGENCE SUMMIT

VOCAL-DIARY : A VOICE COMMAND BASED GROUND TRUTH COLLECTION SYSTEM FOR ACTIVITY RECOGNITION

ENAMUL HOQUE

UVA Center for Wireless Health

Vocal-Diary : A Voice Command based Ground Truth Collection System for Activity Recognition

Enamul Hoque

Robert Dickerson

John Stankovic

Motivation

• Learning regular behavior is very important for most home healthcare applications

• The underlying activity recognition system requires ground truth for training


Motivation

• Each research group develops their own ground truth collection system

• To facilitate future home healthcare research, we need a ground truth collection system that is:
– Easy to install and use
– Accurate
– Reusable by other research groups for new studies


Existing Systems

• Not easy to use

Camera, Real-time User Logging, Daily Journal


Motivation

• Interaction with devices by voice is becoming common


Challenges

• Residents may forget to log activities
• Residents may forget to turn on microphone
• Multi-resident homes
• Ambient noise in homes
• Privacy


Contributions

• Design, implementation and evaluation of Vocal-Diary, a privacy-aware, robust voice command based ground truth collection system for in-home activities

• Features: Easy-to-Use, Robust, Privacy-aware
• Novelties: Two-way acknowledgement, Speaker Recognition, Querying Residents (based on sensors)
• Publicly Available


System Description

Listen Voice Command: “System ‘A’ Start / End”

• Only listens to voice commands of a specific format
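A minimal sketch of what a fixed-format command filter could look like, assuming the speech recognizer hands back plain text. The grammar ("system", then an activity name, then "start" or "end"), the `Command` type, and `parse_command` are illustrative assumptions, not Vocal-Diary's actual implementation.

```python
import re
from typing import NamedTuple, Optional


class Command(NamedTuple):
    activity: str  # the 'A' slot of the command, e.g. "cook"
    action: str    # "start" or "end"


# Assumed grammar: "system <activity> start|end" (keyword and layout are hypothetical).
_COMMAND_RE = re.compile(
    r"^\s*system[,\s]+(?P<activity>[a-z][a-z ]*?)\s+(?P<action>start|end)\s*[.!]?\s*$",
    re.IGNORECASE,
)


def parse_command(transcript: str) -> Optional[Command]:
    """Return a Command if the transcript matches the fixed format, else None."""
    match = _COMMAND_RE.match(transcript)
    if match is None:
        return None  # anything outside the fixed format is ignored
    return Command(match.group("activity").lower(), match.group("action").lower())
```

Under these assumptions, `parse_command("System cook start")` yields a command for the activity "cook", while ordinary conversation such as "what's for dinner" yields `None` and is dropped.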

Listen Voice Command → (Command Detected: “System ‘A’ Start / End”) → Recognize Speaker

• Filters out noise and other speakers

Recognize Speaker → (Speaker Matched) → Playback Command: “Are You Starting / Ending ‘A’?” → Wait for Acknowledgment

• Corrects confusion among commands & filters out other conversations recognized as commands

Wait for Acknowledgment → (Ack. Received: “System Yes / No”) → Recognize Speaker → (Speaker Matched) → Log Activity Details
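The flow from command detection through acknowledgment to logging can be read as a small state machine. The sketch below is one possible rendering, assuming speaker recognition is an external step that yields a boolean; `TwoWayAck`, `Utterance`, and the prompt wording are hypothetical names, and timeouts while waiting for an acknowledgment are not modeled.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple


class State(Enum):
    LISTENING = auto()
    WAITING_FOR_ACK = auto()


@dataclass
class Utterance:
    text: str         # recognized text
    speaker_ok: bool  # outcome of an assumed speaker-recognition step


class TwoWayAck:
    def __init__(self) -> None:
        self.state = State.LISTENING
        self.pending: Optional[Tuple[str, str]] = None
        self.logged: List[Tuple[str, str]] = []  # confirmed (activity, action) pairs
        self.prompts: List[str] = []             # what the system would play back

    def feed(self, utt: Utterance) -> None:
        words = utt.text.lower().split()
        # Noise, unknown speakers, and non-command speech are ignored.
        if not utt.speaker_ok or len(words) < 2 or words[0] != "system":
            return
        if self.state is State.LISTENING:
            if len(words) >= 3 and words[-1] in ("start", "end"):
                activity = " ".join(words[1:-1])
                self.pending = (activity, words[-1])
                verb = "starting" if words[-1] == "start" else "ending"
                self.prompts.append(f"Are you {verb} {activity}?")
                self.state = State.WAITING_FOR_ACK
        elif self.state is State.WAITING_FOR_ACK:
            if words[1:] == ["yes"]:
                self.logged.append(self.pending)  # confirmed: log it
            elif words[1:] != ["no"]:
                return                            # neither yes nor no: keep waiting
            self.pending = None                   # yes or no: back to listening
            self.state = State.LISTENING
```

In this sketch a command is logged only after two checks pass twice: the speaker must match for the command and again for the "yes", which mirrors the two Recognize Speaker boxes in the flow.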

System Description

• Robust
– Two-way acknowledgement and speaker recognition make Vocal-Diary robust

• Privacy-aware
– Only listens to voice commands in a specific format
– No raw audio file containing residents’ voice is saved
– Only start and end times of each activity are saved

• Ease of Use
– Residents do not need to carry any microphone
– No need to start the microphone before talking
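To illustrate the privacy-aware storage described above, here is a minimal sketch of a log that keeps only activity labels with start and end times. `ActivityLog` and its layout are assumptions for illustration; the actual on-disk format of Vocal-Diary is not given in these slides.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# (activity, start time, end time): no audio and no transcript is kept.
Interval = Tuple[str, datetime, datetime]


class ActivityLog:
    def __init__(self) -> None:
        self._open: Dict[str, datetime] = {}  # activities started but not yet ended
        self.intervals: List[Interval] = []

    def record(self, activity: str, action: str, when: datetime) -> None:
        if action == "start":
            self._open[activity] = when
        elif action == "end" and activity in self._open:
            # Close the interval; an "end" without a matching "start" is dropped.
            self.intervals.append((activity, self._open.pop(activity), when))
```

Dropping an unmatched "end" is a design choice of this sketch; a real deployment might instead keep it and flag the missing start for later review.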


Evaluation

• Data collected for 1 month each from 3 homes
– Two single-resident homes
– One double-resident home

• Evaluation Metrics
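The results that follow are reported as precision and recall. A minimal sketch, assuming the standard definitions over logged activity events (true positives `tp`, false positives `fp`, false negatives `fn`); the counts in the usage lines are made up for illustration and are not results from the study.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of logged events that correspond to real commands."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of real commands that ended up in the log."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 8 correct logs, 2 spurious logs, 0 missed commands.
example_precision = precision(tp=8, fp=2)
example_recall = recall(tp=8, fn=0)
```

Under these definitions, false positives caused by ambient noise lower precision, while commands the system fails to detect lower recall.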


Evaluation (Single Resident Home 1)

• Speaker Recognition & Two-way acknowledgement are necessary for robustness

[Figure: precision (%) per activity (Sleep, Breakfast, Dinner, Lunch, Prepare..., Cook, Snack, Dishwash, Toilet, Shower, TV, Laptop, Exit) for three configurations: Without Two-way Ack. & Speaker Recognition (SAPI); With Speaker Recognition Only; With Two-Way Ack. & Speaker Recognition (Vocal-Diary)]


Evaluation (Summary of Precisions)

[Figure: precision (%) by Home ID (1, 2, 3) for three configurations: Without Two-Way Ack. & Speaker Recognition (SAPI); With Speaker Recognition Only; With Two-Way Ack. & Speaker Recognition (Vocal-Diary)]

• Ambient noise introduces a significant number of false positives


Evaluation (Summary of Recalls)

• If the resident gives a voice command, Vocal-Diary always detects it

[Figure: recall (%) by Home ID (1, 2, 3), y-axis from 80 to 100, for three configurations: Without Two-Way Ack. & Speaker Recognition (SAPI); With Speaker Recognition Only; With Two-Way Ack. & Speaker Recognition (Vocal-Diary)]


Evaluation (Feasibility of Voice Commands)

• Vocal-Diary was deployed for 3 months in a home instrumented with different sensors

• The goal was to evaluate how many times the resident forgot to log activities using voice commands

• Ground truth for ground truth was collected by offline inference based on sensor firings

• 992 total activity instances, 59 not logged (6%)
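The 6% figure follows directly from the two counts on this slide; a short worked check (variable names are illustrative):

```python
total_instances = 992  # activity instances inferred offline from sensor firings
not_logged = 59        # instances with no matching voice command

miss_rate = not_logged / total_instances
logged_instances = total_instances - not_logged

print(f"missed: {miss_rate:.1%}, logged: {logged_instances} of {total_instances}")
```

So the resident logged 933 of the 992 instances by voice, and the miss rate of roughly 5.9% rounds to the 6% quoted above.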

21

Evaluation (Effectiveness of Querying Residents)

• Vocal-Diary was deployed for 15 days in a home instrumented with different sensors

• Controlled experiments
• Number of times the resident did not use voice commands: 25
• Vocal-Diary queried in all 25 instances
• Number of false queries: 14
• Number of false queries if motion sensors are ignored: 6
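One way to read these counts is as the share of queries that were actually warranted. The sketch below computes that share from the numbers on this slide. It assumes, and the slide does not state this, that all 25 missed activities are still queried when motion sensors are ignored.

```python
missed_activities = 25       # resident did not use a voice command
warranted_queries = 25       # Vocal-Diary queried in all 25 instances
false_queries_all = 14       # false queries with all sensors
false_queries_no_motion = 6  # false queries when motion sensors are ignored

# Share of missed activities that triggered a query.
query_coverage = warranted_queries / missed_activities

# Share of all queries that were warranted, with all sensors.
warranted_share_all = warranted_queries / (warranted_queries + false_queries_all)

# ASSUMPTION: still 25 warranted queries without motion sensors (not stated in the slides).
warranted_share_no_motion = warranted_queries / (warranted_queries + false_queries_no_motion)
```

Under that assumption, roughly 64% of queries are warranted with all sensors and roughly 81% when motion sensors are ignored, with coverage of missed activities at 100% in the first case.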


Conclusion

• To use Vocal-Diary, contact Professor John Stankovic ([email protected])

• Features: Easy-to-Use, Robust, Privacy-aware
• Novelties: Two-way acknowledgement, Speaker Recognition, Querying Residents (based on sensors)
• Publicly Available

WLSA CONVERGENCE SUMMIT

www.wirelesshealth2014.org