Audio processing methods on marine mammal vocalizations
Xanadu Halkias
Laboratory for the Recognition and Organization of Speech and Audio
http://labrosa.ee.columbia.edu
Xanadu Halkias-www.ee.columbia.edu/~xanadu 2
Sound to Signal
•Sound is pressure variation of the medium (e.g. air pressure for speech, water pressure for marine mammals)
Pressure waves in water
Converting waves to voltage through a microphone (a hydrophone under water)
Time-varying voltage
Analog to digital
sampling + quantizing = digital signal
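A minimal sketch of the two steps, using only the Python standard library. The function name `sample_and_quantize`, the 100 Hz test tone, the 8 kHz rate and the 8-bit depth are illustrative assumptions, not values from the slides.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, n_bits):
    """Sample a continuous-time signal and quantize every sample.

    signal: function of time in seconds, returning a value in [-1.0, 1.0].
    Returns a list of signed integer codes that fit in n_bits.
    """
    n_samples = round(duration_s * sample_rate_hz)
    max_code = 2 ** (n_bits - 1) - 1
    min_code = -(2 ** (n_bits - 1))
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz            # sampling: keep values at discrete times only
        code = round(signal(t) * max_code)  # quantizing: snap to the nearest integer level
        codes.append(max(min_code, min(max_code, code)))
    return codes

def tone(t):
    """Hypothetical analog input: a 100 Hz sine of unit amplitude."""
    return math.sin(2 * math.pi * 100.0 * t)

digital = sample_and_quantize(tone, duration_s=0.01, sample_rate_hz=8000, n_bits=8)
```

More bits give finer amplitude steps; a higher sample rate gives finer time steps.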
Time to frequency and back…
•Fourier transform = decompose a signal into a sum of sines and cosines
Digital signal → Fourier spectrum
Spectrum = the frequency content of the signal (energy/frequency band)
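A small sketch of going from time to frequency and back with a direct discrete Fourier transform. It is written for clarity, not speed (a real analysis would use an FFT library); the 5 Hz tone and 64 Hz sample rate are made-up test values.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform by direct summation, O(N^2)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points) for n in range(n_points))
        for k in range(n_points)
    ]

def idft(spectrum):
    """Inverse transform: rebuild the time signal from its spectrum."""
    n_points = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / n_points) for k in range(n_points)) / n_points
        for n in range(n_points)
    ]

sample_rate_hz = 64
signal = [math.sin(2 * math.pi * 5 * n / sample_rate_hz) for n in range(64)]

spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum[:33]]          # bins from 0 Hz up to half the sample rate
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * sample_rate_hz / len(signal)     # frequency of the strongest component
reconstructed = [c.real for c in idft(spectrum)]      # back to the time domain
```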
Back to sampling…
•Sampling needs to obey the Nyquist limit: $\Omega_T \geq 2\Omega_M$
•Signal has to be bandlimited, i.e. have energy only up to some frequency $\Omega_M$
•Audio is typically sampled at $\Omega_T = 2\pi \cdot 44100$ rad/s (44.1 kHz), so the spectrum extends up to 22050 Hz
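What happens when the limit is violated can be sketched in a few lines. The helper names and the 30 kHz test tone are assumptions for illustration: a tone above half the sample rate folds back ("aliases") to a lower frequency, and its samples become indistinguishable from those of the lower tone.

```python
import math

def nyquist_frequency(sample_rate_hz):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2

def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a sampled pure tone shows up in the spectrum."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

sample_rate_hz = 44100
in_band = apparent_frequency(1000, sample_rate_hz)    # below the limit: unchanged
aliased = apparent_frequency(30000, sample_rate_hz)   # above the limit: folds down

def sampled_cosine(tone_hz, n_samples):
    return [math.cos(2 * math.pi * tone_hz * n / sample_rate_hz) for n in range(n_samples)]

# The 30 kHz tone and its alias produce the same samples (up to rounding error).
max_difference = max(
    abs(a - b) for a, b in zip(sampled_cosine(30000, 50), sampled_cosine(aliased, 50))
)
```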
Looking at sounds-The Spectrogram
•Looking at energy in time and frequency
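One common way to get energy in both time and frequency is the short-time Fourier transform: cut the signal into overlapping frames, window each frame, and take the squared magnitude of its spectrum. The sketch below assumes a Hann window and uses a made-up test signal that jumps from 100 Hz to 200 Hz; frame length, hop and sample rate are arbitrary choices.

```python
import cmath
import math

def spectrogram(x, frame_len, hop):
    """Return spec[frame][bin]: energy per time frame and frequency bin."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        segment = [x[start + n] * window[n] for n in range(frame_len)]
        energies = []
        for k in range(n_bins):
            coeff = sum(
                segment[n] * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            energies.append(abs(coeff) ** 2)
        frames.append(energies)
    return frames

sample_rate_hz = 800
signal = [
    math.sin(2 * math.pi * (100 if n < 128 else 200) * n / sample_rate_hz)
    for n in range(256)
]
spec = spectrogram(signal, frame_len=32, hop=16)

def peak_hz(frame):
    """Frequency of the strongest bin in one frame (bin spacing = rate / frame length)."""
    best_bin = max(range(len(frame)), key=frame.__getitem__)
    return best_bin * sample_rate_hz / 32

first_hz, last_hz = peak_hz(spec[0]), peak_hz(spec[-1])
```

Longer frames sharpen the frequency axis and blur the time axis; shorter frames do the opposite.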
More on spectrograms
Overview of marine mammal research
Call detection
What is it good for…
•Detect different calls within the recording automatically
•Differentiate between species or identify the number of marine mammals in the region through overlapping of calls
•Tracking marine mammals through their calls
•Use calls to analyze and construct a possible language structure
Problems
•Data, data, data…
Call detection approaches
•Noise is the biggest problem
•D. K. Mellinger et al. use the cross-correlation approach
Cross-correlation is a way of measuring how similar two signals are
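As a rough illustration of the idea (not the detector from the cited paper), the sketch below slides a known call template along a recording and scores the similarity at every offset. The chirp template, the weak sinusoidal interference and the embedding position are all synthetic assumptions.

```python
import math

def normalized_cross_correlation(recording, template):
    """Slide the template along the recording; one score in [-1, 1] per offset."""
    m = len(template)
    template_norm = math.sqrt(sum(v * v for v in template))
    scores = []
    for lag in range(len(recording) - m + 1):
        segment = recording[lag:lag + m]
        segment_norm = math.sqrt(sum(v * v for v in segment))
        if segment_norm == 0.0 or template_norm == 0.0:
            scores.append(0.0)  # silent segment: no similarity defined
            continue
        dot = sum(a * b for a, b in zip(segment, template))
        scores.append(dot / (segment_norm * template_norm))
    return scores

# Synthetic "call": a short upward chirp.
template = [math.sin(2 * math.pi * (0.05 * n + 0.002 * n * n)) for n in range(40)]

# Synthetic recording: weak interference with the call buried at sample 37.
call_start = 37
recording = [0.05 * math.sin(2.5 * n) for n in range(200)]
for n, value in enumerate(template):
    recording[call_start + n] += value

scores = normalized_cross_correlation(recording, template)
detected_start = max(range(len(scores)), key=scores.__getitem__)
```

In practice a threshold on the score decides whether a call is present at all; here the best offset is simply reported.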
Call detection-kernel cross-correlation
•This method requires manual intervention and is performed on the signal waveform
Image obtained by D. K. Mellinger and C. W. Clark. "Methods for automatic detection of mysticete sounds", Mar. Fresh. Behav. Physiol. Vol. 29, pp. 163-181, 1997
Call detection-spectrogram correlation
Image obtained by D. K. Mellinger and C. W. Clark. "Methods for automatic detection of mysticete sounds", Mar. Fresh. Behav. Physiol. Vol. 29, pp. 163-181, 1997
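A toy version of correlating in the time-frequency plane instead of on the waveform. The kernel shape used here (positive weight on the expected call contour, negative weight on the bins just above and below it) is an assumption chosen so that flat background noise scores zero; the spectrogram is a hand-built matrix, not real data.

```python
def make_contour_kernel(contour_bins, n_bins):
    """One kernel row per frame: +1 on the contour bin, -0.5 on its two neighbors."""
    kernel = []
    for center in contour_bins:
        row = [0.0] * n_bins
        row[center] = 1.0
        for flank in (center - 1, center + 1):
            if 0 <= flank < n_bins:
                row[flank] = -0.5
        kernel.append(row)
    return kernel

def spectrogram_correlation(spec, kernel):
    """Slide the kernel along the time axis of spec[frame][bin]; one score per start frame."""
    k_len = len(kernel)
    scores = []
    for start in range(len(spec) - k_len + 1):
        score = 0.0
        for dt in range(k_len):
            score += sum(s * k for s, k in zip(spec[start + dt], kernel[dt]))
        scores.append(score)
    return scores

# Hand-built spectrogram: flat background of 1.0 with an upsweep starting at frame 6.
n_frames, n_bins = 20, 8
contour = [2, 3, 4, 5]
call_start = 6
spec = [[1.0] * n_bins for _ in range(n_frames)]
for dt, freq_bin in enumerate(contour):
    spec[call_start + dt][freq_bin] = 5.0

kernel = make_contour_kernel(contour, n_bins)
scores = spectrogram_correlation(spec, kernel)
detected_frame = max(range(len(scores)), key=scores.__getitem__)
```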
“Voiced” calls
Energy appears in multiples of some frequency (=pitch)
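Because the harmonics are multiples of one frequency, the waveform repeats once per pitch period, and that period can be estimated from the autocorrelation. This is one simple estimator among many, sketched on a synthetic signal; the 200 Hz pitch, the harmonic amplitudes and the search range are assumptions.

```python
import math

def estimate_pitch_hz(x, sample_rate_hz, min_hz, max_hz):
    """Pick the lag with the largest autocorrelation inside the allowed pitch range."""
    min_lag = int(sample_rate_hz / max_hz)
    max_lag = int(sample_rate_hz / min_hz)

    def autocorrelation(lag):
        return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorrelation)
    return sample_rate_hz / best_lag

# Synthetic "voiced" sound: energy at 200 Hz and its multiples 400 Hz and 600 Hz.
sample_rate_hz = 8000
harmonic_amplitudes = [1.0, 0.5, 0.25]
signal = [
    sum(
        amp * math.sin(2 * math.pi * 200 * (k + 1) * n / sample_rate_hz)
        for k, amp in enumerate(harmonic_amplitudes)
    )
    for n in range(400)
]

pitch_hz = estimate_pitch_hz(signal, sample_rate_hz, min_hz=80, max_hz=400)
```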
Comments
•Both methods require manual measurements for the construction of the template
•The quality of the results depends highly on the noise present in the data
•The quality and sampling rate of the recordings determine which approach is feasible
•Correlation methods can’t capture all types of calls without constructing different kernels
Linear Predictive Coding
•Idea: the signal, x[n], is formed by adding white noise, e[n], to previous samples weighted by the linear predictive coefficients, a
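E[z] → 1/A[z] → X[z], i.e. $X(z) = E(z) \cdot \dfrac{1}{A(z)}$, where $A(z) = 1 - \sum_{k=1}^{p} a_k z^{-k}$ and $p$ is the number of coefficients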
•The number of coefficients defines the detail that we capture of the original signal
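One standard way to estimate the coefficients is the autocorrelation method solved with the Levinson-Durbin recursion; the slides do not say which estimator was used, so treat this as a sketch. It assumes the convention x[n] = a_1 x[n-1] + ... + a_p x[n-p] + e[n], and the test signal is the response of a known two-coefficient model to a single impulse, so the true answer is available for comparison.

```python
def autocorrelation(x, max_lag):
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag)) for lag in range(max_lag + 1)]

def lpc(x, order):
    """Return (coefficients a_1..a_p, residual energy) via Levinson-Durbin."""
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)   # a[k] holds a_k; a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break             # silent or perfectly predictable input
        reflection = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = reflection
        for j in range(1, i):
            updated[j] = a[j] - reflection * a[i - j]
        a = updated
        error *= 1.0 - reflection * reflection
    return a[1:], error

# Synthesize x[n] = 1.2 x[n-1] - 0.81 x[n-2] + e[n], with e[n] a single unit impulse.
true_coeffs = [1.2, -0.81]
signal = []
for n in range(200):
    excitation = 1.0 if n == 0 else 0.0
    previous_1 = signal[n - 1] if n >= 1 else 0.0
    previous_2 = signal[n - 2] if n >= 2 else 0.0
    signal.append(true_coeffs[0] * previous_1 + true_coeffs[1] * previous_2 + excitation)

estimated_coeffs, residual_energy = lpc(signal, order=2)
```

With too few coefficients the model smooths over detail; with more, it follows finer structure of the spectrum.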
Linear Predictive Coding
•Used in speech for transmission purposes
•Intuition: LPCs model the spectral peaks of your signal
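The intuition can be checked numerically: evaluating 1/A on the unit circle gives a smooth envelope whose peaks sit near the pole angles of the model. The coefficients below are illustrative values (the same convention x[n] = a_1 x[n-1] + a_2 x[n-2] + e[n] is assumed), not estimates from real recordings.

```python
import cmath
import math

def lpc_envelope(coeffs, n_points):
    """Magnitude of 1/A(e^{jw}) at n_points frequencies w in [0, pi)."""
    envelope = []
    for i in range(n_points):
        omega = math.pi * i / n_points
        a_of_z = 1 - sum(
            a_k * cmath.exp(-1j * omega * (k + 1)) for k, a_k in enumerate(coeffs)
        )
        envelope.append(1 / abs(a_of_z))
    return envelope

coeffs = [1.2, -0.81]   # a pole pair at radius 0.9
envelope = lpc_envelope(coeffs, 256)

peak_index = max(range(len(envelope)), key=envelope.__getitem__)
peak_omega = math.pi * peak_index / 256                         # where the envelope peaks
pole_angle = math.acos(coeffs[0] / (2 * math.sqrt(-coeffs[1])))  # angle of the pole pair
```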
LPCs in marine mammal recordings
•Model the peaks in the recordings that likely belong to calls; that way we alleviate the problem of noise
•Unveils harmonic structure not visible in original spectrogram
Hidden Markov Models
•Machine learning involves training a general model based on your data in order to extract and predict desired features
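•HMMs, M_j, are defined by: a set of hidden states, the state transition probabilities a, the observation (emission) probabilities b, and the initial state probabilities π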
HMMs some more…
•Training: estimating the parameters of the model, a, b, π
•Evaluating: we are given a sequence of observations and want to know how likely it is that the model produced them
•Decoding: we have some observations and we want to find out the hidden states
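Evaluating and decoding can be sketched for a small discrete HMM with the forward algorithm and the Viterbi algorithm (training, usually done with Baum-Welch, is left out). The two-state model and its probabilities are made-up numbers; `initial`, `transition` and `emission` play the roles of π, a and b.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Evaluating: probability that the model produced the observation sequence."""
    n_states = len(initial)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in range(n_states)) * emission[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)

def viterbi(initial, transition, emission, observations):
    """Decoding: most likely hidden state sequence for the observations."""
    n_states = len(initial)
    delta = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_delta, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: delta[p] * transition[p][s])
            pointers.append(best_prev)
            new_delta.append(delta[best_prev] * transition[best_prev][s] * emission[s][obs])
        delta = new_delta
        backpointers.append(pointers)
    state = max(range(n_states), key=delta.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):   # trace the best path backwards
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

initial = [0.5, 0.5]                          # pi
transition = [[0.9, 0.1], [0.1, 0.9]]         # a: states tend to persist
emission = [[0.8, 0.2], [0.2, 0.8]]           # b: state i mostly emits symbol i
observations = [0, 0, 1, 1]

likelihood = forward_likelihood(initial, transition, emission, observations)
hidden_states = viterbi(initial, transition, emission, observations)
```

For long sequences the products underflow, so practical implementations work with log probabilities or rescale at each step.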
HMMs in marine mammal vocalizations
•HMMs could provide a call detection tool
•The data has to be workable
•Use frequencies of the spectrogram as hidden states
•Observe the spectrogram and use it for learning
•Tracking the call in the spectrogram
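A possible reading of the last bullets, sketched with Viterbi decoding: treat each frequency bin as a hidden state, reward bins with high energy, and penalize large frequency jumps between frames. The scoring (log energy minus a linear jump penalty) stands in for trained HMM probabilities and is an assumption, as is the hand-built spectrogram with one loud outlier.

```python
import math

def track_call(spec, jump_penalty=1.0):
    """Return one frequency bin per frame of spec[frame][bin] (all energies > 0)."""
    n_bins = len(spec[0])
    score = [math.log(energy) for energy in spec[0]]
    backpointers = []
    for frame in spec[1:]:
        new_score, pointers = [], []
        for f in range(n_bins):
            best_prev = max(
                range(n_bins), key=lambda p: score[p] - jump_penalty * abs(f - p)
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] - jump_penalty * abs(f - best_prev) + math.log(frame[f])
            )
        score = new_score
        backpointers.append(pointers)
    state = max(range(n_bins), key=score.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):   # trace the best track backwards
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hand-built spectrogram: a slowly rising call plus one louder, isolated noise burst.
call_bins = [2, 3, 3, 4, 5, 5]
spec = [[1.0] * 8 for _ in call_bins]
for frame_index, freq_bin in enumerate(call_bins):
    spec[frame_index][freq_bin] = 10.0
spec[2][7] = 12.0                             # outlier that a per-frame maximum would pick

track = track_call(spec)
naive = [max(range(8), key=frame.__getitem__) for frame in spec]
```

The per-frame maximum jumps to the noise burst, while the tracked path stays on the call because the jump there and back costs more than the burst gains.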
References
• D. P. Ellis, www.ee.columbia.edu/~dpwe/e6820 and www.ee.columbia.edu/~dpwe/e4810
• D. K. Mellinger and C. W. Clark, "Methods for automatic detection of mysticete sounds", Mar. Fresh. Behav. Physiol., Vol. 29, pp. 163-181, 1997
• R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, John Wiley & Sons, Inc., 2001