hueber presentation ultrasfest 2013 - Ultraspeechgipsa-lab Thomas Hueber - CNRS/Gipsa-lab -...
Ultraspeech-tools
Acquisition, processing and visualization of ultrasound speech data
for phonetics and speech therapy
Thomas Hueber, CNRS/GIPSA-lab
gipsa-lab
Ultraspeech-tools
Thomas Hueber - CNRS/Gipsa-lab - Ultrafest VI 2
Terason T3000
Ultraspeech: Acquisition
Ultraspeech-player: Visualization
Ultramat (Matlab toolbox): Image processing
… of ultrasound, video, audio, inertial data
Outline of my talk …
Context
Ultraspeech – Acquisition
– Hardware / Software (new features in v1.2)
– Inter-session recalibration procedure
Ultraspeech-player – Visualization
Demo – Real-time silent speech interface driven by ultrasound imaging
Context
Silent speech interface
articulatory-to-acoustic mapping
Camera
Ultrasound probe
Speech signal
Visual articulatory data
“Possible implementation”
ultrasound probe
camera
Silent speaker (normal articulation but no vocalization)
Hueber et al., (2009) "Development of a Silent Speech Interface Driven by Ultrasound and Optical Images of the Tongue and Lips", Speech Communication, 52(4), pp. 288-300.
Visual articulatory feedback
Statistical mapping
3D rendering
GIPSA-lab augmented talking head
Multimodal speech data
Hueber et al., 2012 Diandra Fabre PhD (2013-2016)
Mael Pouget PhD (2013-2016)
Hueber et al. 2007-2013
Ultraspeech
Goal: Synchronous and simultaneous acquisition of high-speed ultrasound, video, and audio signals, at their respective maximum temporal resolution, e.g.:
- US: ~80 fps (with 140° scanning angle and 8 cm depth)
- Video: >60 fps
Hardware:
- Ultrasound: lightweight FireWire device Terason T3000
- Video: WDM-compliant high-speed industrial cameras (e.g. Imaging Source)
- Audio: optimized for ASIO-compatible soundcards (i.e. low-latency drivers)
User-friendly GUI – “push-button”
Data available as series of bitmaps
Tools for large database recording
Website: www.ultraspeech.com
Wiki (documentation, bug report)
New features of latest release (1.2):
- Possibility to display ultrasound and video streams during recording (for monitoring)
- Windows 7 compatibility (for machines with >4 GB RAM)
- Multi-channel audio recording (up to 8, theoretically)
Currently used by ~6 labs ☺
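Because the ultrasound and video streams are stored as separate series of bitmaps at different frame rates (~80 fps and >60 fps in the figures quoted on this slide), analysis code usually has to pair frames across streams. The following is a minimal sketch of that pairing, not Ultraspeech's own synchronization mechanism, which the slides do not describe. It assumes both streams start at the same instant and run at constant nominal rates; the function name and parameters are hypothetical.

```python
import math


def nearest_video_frames(n_us_frames, us_fps, video_fps, n_video_frames):
    """For each ultrasound frame index, return the index of the closest video frame.

    Assumption (not from the slides): both streams start at t = 0 and
    keep a constant frame rate for the whole recording.
    """
    mapping = []
    for i in range(n_us_frames):
        # Ultrasound frame i is captured at t = i / us_fps seconds;
        # express that instant as a (fractional) video frame position.
        position = i * video_fps / us_fps
        j = math.floor(position + 0.5)  # round to nearest, ties upward
        # Clamp in case the video stream is slightly shorter.
        mapping.append(min(j, n_video_frames - 1))
    return mapping


# Example with the nominal rates quoted on the slide.
pairs = nearest_video_frames(n_us_frames=5, us_fps=80, video_fps=60, n_video_frames=100)
```

With real recordings, per-frame timestamps (if available) would be a safer basis than nominal rates, since dropped frames break the constant-rate assumption.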
Ultraspeech
Recently used for capturing beatbox and traditional Mongolian singing
Beatbox - EZRA - US+VIDEO+AUDIO
Traditional Mongolian Music - Bayarbaatar Davaasuren (diphonic + throat singing) - US+EGG+AUDIO
Collaboration N. Henrich (GIPSA-lab)
Ultraspeech: Monitoring probe movements
Goals:
- Recording a large amount of “consistent” data in multiple acquisition sessions (spaced in time)
- Designing a system that helps the experimenter reposition the ultrasound probe in a reference position, relative to the speaker’s head
- Monitoring the probe movements during speech production
Our goal: embedded system → no external reference → all the tracking sensors should be attached to the speaker
Applications: data recording “in the field” (outside of the lab), sensor calibration for silent speech interfaces and visual biofeedback systems
State-of-the-art
HOCUS system (LeHouillier et al., Haskins): probe/head monitoring using Optotrak MOCAP system
Palatron system (Mielke et al.): probe/head monitoring using image processing
Both systems need a reference outside the probe/head reference frame
Proposed approach: Monitoring the orientation of the US probe relative to the speaker’s head using a set of inertial sensors (accelerometers & gyrometer)
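$$
\begin{cases}
a_x = \cos\theta \, \sin\varphi \\
a_y = \sin\theta \, \sin\varphi \\
a_z = \cos\varphi
\end{cases}
$$
[Figure: accelerometer axes x1, y1, z1 and z2, with relative angles θr and φr]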
Accelerometer ~ Inclinometer
Two 3-axis accelerometers: relative orientation of 2 solids modulo a rotation around the vertical axis (gravity)
Principle: using 2 accelerometers to monitor Pitch and Roll angles of the probe relative to the head
Pitch (midsagittal plane)
Roll (coronal plane)
Ultraspeech: Monitoring probe movements
Warning: Orientation ≠ Position!
Yaw angle not monitored
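The accelerometer model on this slide can be inverted to recover the two tilt angles from a static reading: φ from the z component and θ from the x and y components. The sketch below illustrates this under stated assumptions and is not the Ultraspeech implementation. It assumes each sensor is at rest (so it measures gravity only), and it takes the relative angles θr and φr as simple differences between the probe and head sensors, which is a simplification. How θ and φ map to pitch and roll depends on how the sensors are mounted, which the slides do not specify. Function names are hypothetical.

```python
import math


def tilt_angles(ax, ay, az):
    """Return (theta, phi) in degrees from one static 3-axis accelerometer reading.

    Uses the slide's model after normalising the reading:
    ax = cos(theta) sin(phi), ay = sin(theta) sin(phi), az = cos(phi).
    """
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("zero acceleration vector: no gravity direction")
    # Clamp guards against rounding pushing the ratio just outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, az / norm)))
    theta = math.atan2(ay, ax)
    return math.degrees(theta), math.degrees(phi)


def relative_tilt(probe_acc, head_acc):
    """Return (theta_r, phi_r) in degrees: probe tilt minus head tilt.

    Assumption: plain angle differences, adequate only for small
    misalignments between the two sensors.
    """
    theta_p, phi_p = tilt_angles(*probe_acc)
    theta_h, phi_h = tilt_angles(*head_acc)
    # Wrap the theta difference into [-180, 180).
    theta_r = (theta_p - theta_h + 180.0) % 360.0 - 180.0
    return theta_r, phi_p - phi_h
```

A rotation of both sensors around the gravity axis leaves these readings unchanged, which is why the yaw angle cannot be monitored this way.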
Experimental evaluation
Goal: pitch/roll angles estimated from accelerometer data vs. MOCAP system (Qualisys)
Protocol:
- Task: repositioning the US probe to a specific orientation given pitch/roll angles estimated from the accelerometers (5 trials, 2 participants)
- Measure: residual pitch/roll angles after this repositioning, calculated from Qualisys MOCAP data (reference)
Results
[Figure labels: r = 10 cm; rotation axis (g, z); rotation center; probe. Annotations: r(1 − cos ϕ) < 2e-4 mm; r sin ϕ < 0.1 mm; Δϕ ≈ 0.58°. Legend: reference position; instants after recalibration.]
Δφ = 0.58° (σ = 0.1°)
Δθ = 0.58° (σ = 0.3°)
Pitch
Roll
Yaw: Δψ = 9.75° (σ = 1°)
As expected … gyrometer! (coming soon …)
Ultramat
Open-source Matlab toolbox to process ultrasound and video data recorded using Ultraspeech
Beta version 0.1 available on www.ultraspeech.com
Conversion of series of bitmaps (ultrasound/video) into AVI files
Extraction of visual features (contour or “global” features)
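[Figure: PCA on a training set of ultrasound images; a tongue image is decomposed as α1 × EigenTongue 1 + α2 × EigenTongue 2 + ... + αn × EigenTongue n]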
Edge tracking
- Initialization on semi-polar grid
- Contour extraction/tracking using snake (similar to EdgeTrak but … slower!)
EigenTongue feature extraction (Hueber, et al., ICASSP 2007)
Feel free to contribute ;-)
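The EigenTongue idea mentioned on this slide, PCA over a training set of images followed by projection of each frame onto the leading components, can be sketched as follows. This is an illustration only, not Ultramat's code (Ultramat is a Matlab toolbox). To stay dependency-free it finds the components by power iteration in pure Python; a practical implementation would rely on a linear algebra library. Images are assumed to be flattened into equal-length lists of pixel values, and all function names are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def eigentongues(images, n_components, n_iter=200, seed=0):
    """PCA basis of flattened images (one list of pixel values per image).

    Returns (mean_image, components). Components are unit-norm vectors found
    by power iteration, each kept orthogonal to the ones found before it.
    """
    n = len(images)
    mean = [sum(col) / n for col in zip(*images)]
    centered = [[p - m for p, m in zip(img, mean)] for img in images]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in mean]  # random start vector
        for _ in range(n_iter):
            # w = X^T (X v): scatter matrix applied to v without forming it.
            scores = [dot(row, v) for row in centered]
            w = [dot(scores, col) for col in zip(*centered)]
            for c in components:  # remove directions already extracted
                proj = dot(w, c)
                w = [wi - proj * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:  # no variance left in the remaining subspace
                return mean, components
            v = [wi / norm for wi in w]
        components.append(v)
    return mean, components


def project(image, mean, components):
    """Coefficients (alpha_1 ... alpha_n) of one image on the PCA basis."""
    centered = [p - m for p, m in zip(image, mean)]
    return [dot(centered, c) for c in components]


# Toy "images" of 4 pixels that vary along a single direction.
training = [[5 + 0.6 * t, 5 + 0.8 * t, 5.0, 5.0] for t in (-2, -1, 0, 1, 2)]
mean, basis = eigentongues(training, n_components=1)
alphas = project(training[-1], mean, basis)
```

The sign of each component is arbitrary, so coefficients are only defined up to a sign flip per component.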
Ultraspeech-player
Standalone software dedicated to the intuitive visualization of ultrasound speech data recorded with Ultraspeech
Designed for pronunciation training in the context of speech therapy and second language learning.
Main features:
- “Player + database” architecture
- Audiovisual time-stretching module that slows down both the articulatory gesture and its corresponding acoustic realization (based on Harmonic+Noise modeling)
- Superimposition of US + vocal tract shape, extracted from an MRI scan of the same speaker
Free to download (v 0.1) (software (PC / Mac OS) + databases) → www.ultraspeech.com/player/
Real-time silent speech interface driven by ultrasound imaging
Conclusion and perspectives
Ultraspeech-tools
– Ultraspeech: Acquisition of multimodal speech data
– Ultramat: Processing (Matlab toolbox)
– Ultraspeech-player: Intuitive visualization
Free to download at www.ultraspeech.com ;-)
Perspectives
– Ultraspeech:
  - Compatibility with Telemed ultrasound system (distributed by Articulate Instruments)
  - Direct-to-disk mode
  - Sensor-based probe positioning: monitoring rotation angle around vertical axis using a 3-axis gyrometer (must solve the drift problem …)
– Ultramat: Improvements of edge tracking algorithms
– Ultraspeech-player: release of beatbox and classical singing databases (soon)
Thank you...
... for your attention
THE END