SoundFlux - UC Berkeley School of Information
SoundFlux
A sound based fall detection system
Romulo Manzano, Matt Thielen, Mike Frazzini (and Randy)
The Problem
Falls are the #1 Cause of Elderly Injury and Death¹
¹ Source: NEISS All Injury Program, Office of Statistics and Programming, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, and Consumer Product Safety Commission. Raw data: https://injuryfacts.nsc.org/all-injuries/deaths-by-demographics/top-10-preventable-injuries/data-details/
² Source: World Health Organization, https://www.who.int/news-room/fact-sheets/detail/falls
9+ million fall-related ER visits¹
37.3+ million injuries globally²
Fall Detection and Response is Critical
● The medical outcome of a fall depends largely on the response and rescue time.¹
● A significant number of documented cases show multiple falls, undetected and unmitigated, preceding an injurious or fatal fall.¹
¹ Consultation with Anna Smith, ACE unit leader at University of Colorado Anschutz Medical Center and member of the system-wide Falls Task Force.
Fall Detection Solutions and State of the Art
1. Wearables are the primary solution: expensive | intrusive | complicated and cumbersome
2. Tri-axial accelerometer-based detection worn on the torso is the state of the art¹: 99.17% sensitivity (true positives) | 99.69% specificity (true negatives) | simulated with ~200 falls
3. A contextual sound-signal approach with an accelerometer trigger is novel
¹ Source: Dongha Lim, Chulho Park, Nam Ho Kim, Sang-Hoon Kim, and Yun Seop Yu, "Fall-Detection Algorithm Using 3-Axis Acceleration: Combination with Simple Threshold and Hidden Markov Model." Journal of Applied Mathematics, Volume 2014, Article ID 896030, https://www.hindawi.com/journals/jam/2014/896030/
Our Solution
SoundFlux
● Non-intrusive edge device for fall detection and response
○ Does not require a wearable
○ Mic array with accelerometer trigger
○ Inference done on device
○ Easy setup via mobile app connected to a WiFi network
● Cost effective solution without costly service commitments
● Upon fall detection, can perform various responses, including notification escalation paths and integration with local emergency medical responders, health professionals, and insurance providers
● Modeled after popular home assistants (Google Home, Amazon Echo) with potential for integration
Introducing Randy
● We were unable to find any useful audio data on human falls, nor were we able to find any humans willing to volunteer to fall at all (let alone hundreds of times), so we bought Randy.
● Randy is a 165 lb human-like CPR manikin with articulated joints and human weight distribution.
● Industry leading “simulaid” used by first responders (EMT, Fire, Police) in a variety of training programs.
● Note: At this stage of our model and inference, the class “falling dummy” represents a human fall.
Fall Detection Methodology
● Rescue Randy - lifelike jointed 165 lb rescue manikin with human weight distribution
● Rig for production of simulated falls
● Settings for ~130 trials:
○ Floor types (2)
○ Microphone distances (5)
○ Fall types (3)
○ Rooms/buildings/stairs (8)
Clumsy Randy Data Capture
Feature Extraction: Mel-Spectrograms
Why spectrograms?
● Audio signals are hard to analyze in the time domain (time vs. amplitude)
● A frequency representation of audio makes it easier to surface additional information (frequency vs. amplitude)
● Spectrograms take chunks of smaller time-to-frequency transformations and stack them over a time dimension
Mel-Spectrogram
● Frequency bands are equally spaced on the Mel scale
● The Mel scale warps frequency to match more closely what humans hear
● Increasingly used in music information retrieval and speech recognition
● Initial concern: could we be filtering out relevant information?
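The spectrogram pipeline above can be sketched in plain NumPy (a minimal illustration; in practice a library such as librosa is the usual tool, and the FFT size, hop length, and band count here are assumptions, not the project's settings):

```python
import numpy as np

def hz_to_mel(f):
    # Mel scale: equal spacing on this scale approximates human pitch perception
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=40):
    # 1) STFT: stack windowed FFT power frames over a time dimension
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)) ** 2)
    spec = np.array(frames).T  # shape: (n_fft // 2 + 1, n_frames)

    # 2) Triangular filterbank with bands equally spaced on the Mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank @ spec  # shape: (n_mels, n_frames)

# Toy example: 1 second of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
mels = mel_spectrogram(tone, sr)
print(mels.shape)  # (40, 30)
```

Taking the log of the resulting matrix and rendering it as an image gives the Mel-spectrogram inputs fed to the classifier.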
SoundFlux Model - V1
● Architecture: VGG16 (~15M parameters, ~7.3M trainable)
● Transfer learning: pre-trained on the ImageNet dataset (~14M images)
○ Frozen layers: first 10 convolutional layers
○ Trainable layers: 3 convolutional and 3 fully connected layers
● Training data: ~450 spectrogram images from the Clumsy Randy dataset and the FreeSound Kaggle dataset*
● 3 classes:
○ Falling Randy (~130 samples)
○ Falling Object (~200 samples)
○ Generic Noise (~120 samples)
● 95% accuracy - too good to be true?
*https://www.kaggle.com/c/freesound-audio-tagging/data
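The frozen/trainable split above can be sketched in Keras (a sketch only: the pooling, head layer sizes, and optimizer are assumptions chosen to roughly match the stated parameter counts, and `weights=None` merely builds the architecture where `weights="imagenet"` would load the pre-trained weights):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Build the VGG16 backbone. In practice weights="imagenet" loads the
# pre-trained ImageNet weights; None keeps this sketch lightweight.
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# Freeze the first 10 convolutional layers; the last 3 remain trainable.
conv_layers = [l for l in base.layers if isinstance(l, layers.Conv2D)]
for layer in conv_layers[:10]:
    layer.trainable = False

# Small classification head: 3 fully connected layers, 3 output classes.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(3, activation="softmax"),  # falling_dummy / falling_object / noise
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```

With this split, the last 3 convolutional blocks (~7.1M parameters) plus the small head account for roughly the ~7.3M trainable parameters mentioned above.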
V1 Model Stress Test
People talk... Falling Object + People Talking Loudly:
- Falling Dummy: 0.3% probability
- Falling Object: 0.1% probability
- General Noise: 99.6% probability
Shower... Falling Object + Running Shower:
- Falling Dummy: 1.2% probability
- Falling Object: 0.2% probability
- General Noise: 98.6% probability
Just noise... Falling Object + Low-Pitch White Noise:
- Falling Dummy: 54.7% probability
- Falling Object: 15.9% probability
- General Noise: 29.4% probability
95% accuracy held only when spectrograms resembled the clean training images.
Not so great...
Superposition: The foundation of Digital Signal Processing
● Audio files are digital signals and can be combined (additive synthesis)
● Samples from the Clumsy Randy Dataset were synthesized with relevant audio samples
● Resulting dataset was 40x the original size
Data augmentation scenarios:
● Group talk ambiance
● Running shower
● Music
● Children playing ambiance
● White noise
Source: https://freesound.org/
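Additive synthesis here is just sample-wise addition of two digital signals. A minimal sketch of the overlay step (the background gain and the toy signals are illustrative, not the project's exact mixing parameters):

```python
import numpy as np

def mix(signal, background, background_gain=0.5):
    """Overlay a background recording onto a fall recording (additive synthesis)."""
    # Loop or truncate the background so the two lengths match
    reps = int(np.ceil(len(signal) / len(background)))
    background = np.tile(background, reps)[: len(signal)]
    mixed = signal + background_gain * background
    # Normalize to avoid clipping after the addition
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 1.0 else mixed

# Toy example: a decaying "thud" plus low-amplitude background noise
sr = 16000
t = np.arange(sr) / sr
thud = np.exp(-8 * t) * np.sin(2 * np.pi * 80 * t)
noise = 0.3 * np.random.default_rng(0).standard_normal(sr // 2)
augmented = mix(thud, noise)
```

Pairing every Clumsy Randy sample with each background class (at one or more gains) is what multiplies the dataset to roughly 40x its original size.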
Data Augmentation Approach
[Audio spectrograms: falling object, people talking, and falling object with people-talking overlay]
Model Comparison: SoundFlux V1 vs. V2
V2 at a glance
● 97.5% accuracy on the augmented test set
● 90% accuracy on the stress-test dataset
● 99% recall on the falling_dummy class (1% false negatives)
V1 Performance (no augmentation)
Performed extremely poorly on the augmented dataset, with a 36% F1 score on the falling_dummy class and 12% accuracy on the stress-test dataset.
Vs
V2 Performance (augmentation)
Generalizes well; tested on an unseen augmented dataset and the stress-test dataset. F1 score of 92% with recall of 99% for the falling_dummy class (which was given a higher class weighting).
SoundFlux Architecture
Edge: inference on device; no personal data is necessarily transferred to the cloud (that includes recordings!)
● Device count scales according to deployment
● Constant recording of audio into 5 s WAV files
● Uses vibration readings as the 'trigger'
● If a fall is suspected, ping SoundFlux Cloud
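The vibration trigger can be sketched as a threshold on sudden changes in acceleration magnitude (the threshold value and data shapes are assumptions for illustration, not the deployed settings):

```python
import numpy as np

def vibration_trigger(accel, threshold=1.5):
    """accel: (n, 3) array of 3-axis accelerometer readings in g.
    Fires when the jump in acceleration magnitude between consecutive
    samples exceeds the threshold - the cue to run audio inference."""
    mag = np.linalg.norm(accel, axis=1)
    return bool(np.any(np.abs(np.diff(mag)) > threshold))

# Quiet room: gravity only -> no trigger, no inference cost
quiet = np.tile([0.0, 0.0, 1.0], (100, 1))
assert vibration_trigger(quiet) is False

# A sharp spike, as from an impact -> trigger fires, classify the audio
impact = quiet.copy()
impact[50] = [2.0, 1.0, 3.0]
assert vibration_trigger(impact) is True
```

Gating the model on this cheap check is what lets inference run on the device without classifying every 5 s clip.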
Cloud: overall account management and notification framework
● API to communicate with edge devices (receive inference results)
● Device management settings (opt in/out of metadata sharing, deploy model updates, etc.)
● Communication framework
[Architecture diagram: Cloud (account management, device management, notification framework) connected to Edge Devices #1 through #N (trigger monitoring, sample capturing, inference and fall detection, send alert to server)]
Response Time: 11s
Edge Device: "SoundFlux Home" Prototype
● Raspberry Pi: $29.95 - Raspbian OS, Python venv with TensorFlow
● Audio sensor: $24.90 - high-quality 4-mic array
● Vibration sensor: $9.90 - 3-axis accelerometer
Mitigating edge constraints:
● Temporal segmentation of raw sensor data
● Targeted feature extraction and inference
● Robust framework to purge unused data
Total Retail Cost: $64.75
To the Cloud!
● Implements Emergency Notifications
● View past events, data, and classifications
● Configure privacy and notification settings
SoundFlux Cloud Portal & API
Live Demo: https://demo.soundflux.io/
● Demo web application is live: https://demo.soundflux.io/
○ API connecting front and back end is also up and running: https://api.soundflux.io
○ Desktop and mobile friendly
● Future features:
○ Edge device management
○ Data sharing and misclassification flagging
Next Steps (Post MIDS)
> Towards commercial deployment of SoundFlux:
● Reduce response time
● Finalize hardware
● Seek external partnerships
● Diversify data
Questions