SPEECH CODING Maryam Zebarjad Alessandro Chiumento Supervisor : Sylwester Szczpaniak.
SPEECH CODING
Maryam Zebarjad, Alessandro Chiumento
Supervisor: Sylwester Szczpaniak
Outline
- Properties of speech signals
- Why coding?
- Implemented techniques
  - Differential Pulse-Code Modulation
  - DCT Transform Coder
  - LPC Vocoder
- Results
SPEECH PROPERTIES
Speech is produced when air is forced from the lungs through the vocal cords and along the vocal tract.
It can be modeled by two states:
Voiced Speech:
- produced by the vibrations of the vocal cords.
- quasi-periodic in the time domain and harmonically structured in the frequency domain.
Unvoiced Speech:
- produced, for example, by high-speed air passing through a constriction in the vocal tract (mouth and lips).
- random-like and broadband (like white noise).
Why coding?
The original speech signal has to be processed in order to:
- MINIMIZE DIMENSIONS (storage)
- MINIMIZE BITRATE (transmission)
VOIP, MOBILE TELEPHONY
DPCM
We applied DPCM to a wave file; here are the results for different prediction orders:
- we have the coder and decoder signals for prediction orders 1, 2, 5, 10 and 19.
- we have the corresponding wave files for each stage.
- we also have the SNR for each prediction order.
For the autocorrelation method, these were the basic formulas, as previously stated.
The DPCM Method with autocorrelation
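A minimal sketch of DPCM with an autocorrelation-based predictor, in plain Python. It assumes a uniform quantiser with a fixed step size and a zero initial history, neither of which is stated on the slides; the function names (`predictor`, `dpcm_encode`, `dpcm_decode`) are illustrative only.

```python
def autocorr(x, max_lag):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0..max_lag."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]


def predictor(x, order):
    """Solve the normal equations R a = r by Gaussian elimination with pivoting."""
    r = autocorr(x, order)
    # Augmented matrix [R | r[1:]], with R[i][j] = r[|i - j|]
    m = [[r[abs(i - j)] for j in range(order)] + [r[i + 1]] for i in range(order)]
    for col in range(order):
        pivot = max(range(col, order), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, order):
            f = m[row][col] / m[col][col]
            for j in range(col, order + 1):
                m[row][j] -= f * m[col][j]
    a = [0.0] * order
    for i in reversed(range(order)):
        s = sum(m[i][j] * a[j] for j in range(i + 1, order))
        a[i] = (m[i][order] - s) / m[i][i]
    return a


def predict(a, history):
    """Linear prediction; history[-1] is the most recent decoded sample."""
    return sum(a[k] * history[-1 - k] for k in range(len(a)))


def dpcm_encode(x, a, step):
    """Quantise the prediction error, predicting from already-decoded samples."""
    history = [0.0] * len(a)              # assumed zero initial state
    codes = []
    for sample in x:
        pred = predict(a, history)
        code = round((sample - pred) / step)
        codes.append(code)
        history.append(pred + code * step)  # same value the decoder will rebuild
    return codes


def dpcm_decode(codes, a, step):
    """Rebuild the signal from the quantised prediction errors."""
    history = [0.0] * len(a)
    for code in codes:
        history.append(predict(a, history) + code * step)
    return history[len(a):]
```

Changing the prediction order only changes the length of `a`. Because the encoder predicts from decoded samples, the reconstruction error of each sample in this sketch stays within half a quantiser step.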
The Original Signal
Coder Signal for Prediction Order of 1
Decoder Signal for Prediction Order of 1
Coder Signal for the Prediction Order of 2
Decoder Signal for the Prediction Order of 2
Coder Signal for the Prediction Order of 5
Decoder Signal for the Prediction Order of 5
Coder Signal for the Prediction Order of 10
Decoder Signal for the Prediction Order of 10
Coder Signal for the Prediction Order of 19
Decoder Signal for the Prediction Order of 19
SNR
We then calculate the decoder SNR for each prediction order with the following formula:
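The usual definition, assumed here, is the ratio of the original signal's energy to the energy of the reconstruction error, expressed in decibels. A small sketch (the name `snr_db` is illustrative):

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between a signal and its decoded version."""
    signal_energy = sum(s * s for s in original)
    # The "noise" is the sample-by-sample reconstruction error.
    noise_energy = sum((s - d) ** 2 for s, d in zip(original, decoded))
    return 10.0 * math.log10(signal_energy / noise_energy)
```

The function is undefined for a perfect reconstruction (zero error energy), so a caller would need to handle that case separately.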
LPC Vocoder
Vocoders rely strongly on the properties of speech.
Two-state excitation model:
- pulses for voiced signal
- random noise for unvoiced signal
The vocal tract is modeled as an all-pole function.
Source-System synthesis model
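In its usual textbook form (assumed here; the symbols $a_k$, $p$, $G$ and $u(n)$ are illustrative), the source-system model produces the speech sample $s(n)$ by passing the excitation $u(n)$, a pulse train or random noise, through an all-pole filter of order $p$ with gain $G$:

```latex
s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + G\, u(n),
\qquad
H(z) = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}
```

The poles of $H(z)$ are the roots of the denominator, so estimating the coefficients $a_k$ is equivalent to estimating the poles of the system.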
LPC Vocoder
We have to find:
- pitch period
- gain
- poles of the system
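Once these three quantities are known for a frame, the frame can be resynthesised. A minimal sketch, assuming a unit pulse train for voiced frames, Gaussian noise for unvoiced frames and a zero initial filter state (`synthesize_frame` and its parameters are hypothetical names, not taken from the slides):

```python
import random


def synthesize_frame(a, gain, frame_len, pitch_period=None, seed=0):
    """Drive an all-pole filter with pulses (voiced) or noise (unvoiced).

    a            -- predictor coefficients a[0..p-1], a[0] weighting the latest sample
    pitch_period -- pitch period in samples, or None for an unvoiced frame
    """
    if pitch_period is None:
        rng = random.Random(seed)
        excitation = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    else:
        excitation = [1.0 if n % pitch_period == 0 else 0.0
                      for n in range(frame_len)]
    out = [0.0] * len(a)                  # assumed zero initial filter state
    for n in range(frame_len):
        feedback = sum(a[k] * out[-1 - k] for k in range(len(a)))
        out.append(feedback + gain * excitation[n])
    return out[len(a):]
```

A real vocoder would carry the filter state and the pulse phase over from one frame to the next; this sketch restarts both at every frame for simplicity.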
LPC Vocoder
V/UV DETECTION is done by taking the energy of each frame and comparing it to a threshold, and by taking the zero-crossing rate and comparing it to a threshold.
PITCH DETECTION is done by the autocorrelation method: we correlate the signal with itself; the output has a maximum at a lag equal to the pitch period.
POLES OF THE SYSTEM are estimated using LPC, in our case the LEVINSON-DURBIN algorithm.
GAIN IS ESTIMATED:
- If the frame is unvoiced we take the sqrt of the average power of the frame.
- If the frame is voiced we use the average power for every pitch period.
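The four analysis steps above can be sketched in plain Python. The thresholds, the lag search range and all function names are assumptions for illustration. For voiced frames the gain is read here as the root of the average power over a whole number of pitch periods, which is one possible interpretation of the slide.

```python
import math


def energy(frame):
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    crossings = sum(1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0))
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_thr, zcr_thr):
    """Voiced if the energy is high and the zero-crossing rate is low."""
    return energy(frame) > energy_thr and zero_crossing_rate(frame) < zcr_thr


def autocorr(frame, lag):
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def pitch_period(frame, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    return max(range(min_lag, max_lag + 1), key=lambda lag: autocorr(frame, lag))


def levinson_durbin(r, order):
    """Predictor coefficients and residual energy from autocorrelations r[0..order]."""
    a, err = [], r[0]
    for i in range(order):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = (r[i + 1] - acc) / err          # reflection coefficient
        a = [a[j] - k * a[i - 1 - j] for j in range(i)] + [k]
        err *= 1.0 - k * k
    return a, err


def frame_gain(frame, period=None):
    """RMS of the frame; for voiced frames, over whole pitch periods only."""
    if period is not None and 0 < period <= len(frame):
        frame = frame[:(len(frame) // period) * period]
    return math.sqrt(energy(frame))
```

In practice the lag range would be chosen from the expected pitch range of the speaker and the sampling rate, and the thresholds tuned on the recordings at hand.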
LPC Vocoder
ORIGINAL SAMPLE
SYNTHESIZED SAMPLES
DCT Transform Coder
- There is no standard
- Same structure as the vocoder
DCT Transform Coder
The Discrete Cosine Transform is a unitary transform that expresses the incoming signal as a finite sum of cosine functions.
So if the signal is periodic we need a “small” number of cosines (coefficients); if instead the signal is non-periodic, many more cosines are needed.
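A small sketch of this idea, assuming the orthonormal DCT-II (the slides do not say which DCT variant was used) and hypothetical function names. Coding a frame then amounts to keeping only the largest coefficients and inverting the transform.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


def idct(c):
    """Inverse of dct(): rebuilds the samples as a finite sum of cosines."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def keep_largest(c, count):
    """Zero all but the `count` largest-magnitude coefficients."""
    order = sorted(range(len(c)), key=lambda k: abs(c[k]), reverse=True)
    keep = set(order[:count])
    return [c[k] if k in keep else 0.0 for k in range(len(c))]
```

A frame that is itself one of the cosine basis functions is recovered from a single coefficient, while a noise-like frame spreads its energy over most coefficients. Because the transform is unitary, the energy discarded with the dropped coefficients equals the energy of the reconstruction error.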
DCT Transform Coder
Voiced frame: waveform, DCT coefficients
Unvoiced frame: waveform, DCT coefficients
DCT Transform Coder
ORIGINAL SAMPLE
Synthesized samples:
- 22.5 ms: 720 coeff V, 1460 coeff UV
- 22.5 ms: 40 coeff V, 1460 coeff UV
- 50 ms: 720 coeff V, 1460 coeff UV