Pitch and time scale modifications

Prepared by: Doaa Gamal, Lecturer Assistant, Faculty of Engineering – Suez Canal University


Page 1: Pitch and time scale modifications

Prepared by: Doaa Gamal

Lecturer Assistant, Faculty of Engineering – Suez Canal University

Page 2: Pitch and time scale modifications

Outline

Introduction

Applications

History of time and pitch modification

Time-domain techniques

Frequency-domain techniques

Parametric techniques

Conclusion

Page 3: Pitch and time scale modifications

Introduction

Time-scale modification: slow down or speed up a given signal, possibly in a time-varying manner, without altering the signal's spectral content (and in particular its pitch when the signal is periodic).

Pitch-scale modification: modify the pitch of the signal, possibly in a time-varying manner, without altering the signal's time evolution (and in particular its duration).

Page 4: Pitch and time scale modifications

Introduction

Time-scaling or pitch-scaling is not easy because the time and frequency characteristics of a signal, being related by the Fourier transform, are not independent.

The simplest method of time-scaling a sound is to replay it at a different rate. With magnetic tapes, for example, the tape speed may be varied, but this incurs a simultaneous change in the pitch of the signal.

Page 5: Pitch and time scale modifications

Applications

Speech synthesizers

Post-synchronization

Data compression

Reading for the blind

Foreign language learning

Voice transformation

Page 6: Pitch and time scale modifications

History of time and pitch modification

Signal type | Method                                       | Technique
Analog      | Tape recorder machine                        | Time-domain
Digital     | Digital tape recorder                        | Time-domain
Digital     | Periodicity-driven methods                   | Time-domain
Digital     | STFT                                         | Frequency-domain
Digital     | Linear prediction models & sinusoidal models | Parametric models

Page 7: Pitch and time scale modifications

Time and pitch modification techniques

Non-parametric
  Time-domain techniques
  Frequency-domain techniques

Parametric

Page 8: Pitch and time scale modifications

Time-domain techniques

Pitch independent methods

Require very few calculations.

Lend themselves very well to real-time implementation.

Prone to artifacts, because no precaution is taken at the splicing points other than guaranteeing continuity.
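A minimal sketch of a pitch-independent method, assuming plain overlap-add with fixed-length Hann frames (the function name `ola_stretch`, the frame length, and the 50% synthesis overlap are illustrative choices, not taken from the slides). Frames are read at one stride and written at another; the cross-fade guarantees continuity at the splices but ignores the signal's periodicity, which is where the artifacts come from.

```python
import math

def ola_stretch(x, alpha, n=256):
    """Time-stretch x by a factor alpha with fixed frames (no pitch information)."""
    hop_s = n // 2                              # synthesis stride: 50% overlap
    hop_a = max(1, int(round(hop_s / alpha)))   # analysis stride
    # Periodic Hann window: copies shifted by n/2 sum to exactly 1.
    w = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    frames = (len(x) - n) // hop_a + 1
    y = [0.0] * ((frames - 1) * hop_s + n)
    for m in range(frames):
        for i in range(n):
            # Read the frame at m*hop_a, write it at m*hop_s, cross-fading overlaps.
            y[m * hop_s + i] += w[i] * x[m * hop_a + i]
    return y

# Usage: stretch a constant signal to roughly twice its length.
stretched = ola_stretch([1.0] * 2048, alpha=2.0)
```

On a periodic input the frames are generally spliced at arbitrary phases, so the waveforms partially cancel or reinforce each other in the overlaps.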

Page 9: Pitch and time scale modifications

Time-domain techniques

Periodicity-driven methods

The most popular method using pitch information is TD-PSOLA (Time-Domain Pitch-Synchronous Overlap-Add).

It works well for moderate modification factors (between 0.5 and 2).

Page 10: Pitch and time scale modifications

TD-PSOLA analysis-synthesis process without modification

[Figure: steps 1 to 3 of the analysis-synthesis process]

Page 11: Pitch and time scale modifications

TD-PSOLA analysis-synthesis process without modification

The output speech waveform of PSOLA analysis-synthesis is perceptually indistinguishable from the original waveform.
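A small sketch of analysis-synthesis without modification, assuming a constant pitch period and Hann frames two periods long centered on each pitch-mark (the name `psola_identity` and the test signal are hypothetical). Overlap-adding the frames at their original positions gives back the input away from the signal edges.

```python
import math

def psola_identity(x, marks, period):
    """Window two-period frames at each pitch-mark and overlap-add them in place."""
    # Periodic Hann window of length 2*period; neighbors one period apart sum to 1.
    w = [0.5 - 0.5 * math.cos(math.pi * i / period) for i in range(2 * period)]
    y = [0.0] * len(x)
    for m in marks:
        for i in range(2 * period):
            t = m - period + i
            if 0 <= t < len(x):
                y[t] += w[i] * x[t]
    return y

period = 50
x = [math.sin(2 * math.pi * n / period) for n in range(1000)]
marks = list(range(period, 1000 - period + 1, period))   # one mark per period
y = psola_identity(x, marks, period)
max_err = max(abs(a - b) for a, b in zip(x[period:-period], y[period:-period]))
```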

[Figure: step 4 of the analysis-synthesis process]

Page 12: Pitch and time scale modifications

Pitch-scaling (lowering) using TD-PSOLA

Page 13: Pitch and time scale modifications

Time-scaling (lengthening) using TD-PSOLA

Page 14: Pitch and time scale modifications

Computation of synthesis pitch-marks for pitch modification


Page 15: Pitch and time scale modifications

Computation of synthesis pitch-marks for pitch modification (raising)
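One possible way to place the synthesis pitch-marks for pitch modification, sketched under the assumption that the synthesis period is the local analysis period divided by a pitch factor `beta` (`beta > 1` raises the pitch, `beta < 1` lowers it). The function names and the factor name are illustrative, not from the slides.

```python
from bisect import bisect_right

def local_period(analysis_marks, t):
    """Analysis pitch period in effect at time t (spacing of the surrounding marks)."""
    i = min(max(bisect_right(analysis_marks, t) - 1, 0), len(analysis_marks) - 2)
    return analysis_marks[i + 1] - analysis_marks[i]

def synthesis_marks_for_pitch(analysis_marks, beta):
    """Place synthesis marks so that each period is the local period divided by beta."""
    marks = [float(analysis_marks[0])]
    while True:
        nxt = marks[-1] + local_period(analysis_marks, marks[-1]) / beta
        if nxt > analysis_marks[-1]:    # duration is unchanged: stop at the last mark
            return marks
        marks.append(nxt)

analysis = list(range(0, 1001, 100))                  # constant period of 100 samples
raised = synthesis_marks_for_pitch(analysis, 2.0)     # marks every 50 samples
lowered = synthesis_marks_for_pitch(analysis, 0.5)    # marks every 200 samples
```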

Page 16: Pitch and time scale modifications

Computation of synthesis pitch-marks for duration modification


Page 17: Pitch and time scale modifications

Computation of synthesis pitch-marks for time-scale modification (lengthening)
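A hedged sketch for the time-scale case, assuming a constant stretch factor `alpha` and the linear time mapping t -> t / alpha from the output axis back to the input axis. Each synthesis mark has a "virtual" pitch-mark on the original time axis, and the synthesis period is the analysis period found there, so the pitch is preserved while the duration changes. All names are hypothetical.

```python
from bisect import bisect_right

def local_period(analysis_marks, t):
    """Analysis pitch period in effect at time t (spacing of the surrounding marks)."""
    i = min(max(bisect_right(analysis_marks, t) - 1, 0), len(analysis_marks) - 2)
    return analysis_marks[i + 1] - analysis_marks[i]

def synthesis_marks_for_duration(analysis_marks, alpha):
    """Return (synthesis marks, virtual marks) for a stretch factor alpha."""
    start, end = analysis_marks[0], analysis_marks[-1]
    synth, virtual = [float(start)], [float(start)]
    while True:
        # Keep the original period, read at the virtual position.
        nxt = synth[-1] + local_period(analysis_marks, virtual[-1])
        nxt_virtual = start + (nxt - start) / alpha   # map back to the input axis
        if nxt_virtual > end:
            return synth, virtual
        synth.append(nxt)
        virtual.append(nxt_virtual)

analysis = list(range(0, 1001, 100))
synth, virtual = synthesis_marks_for_duration(analysis, 2.0)   # lengthening by 2
```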

Page 18: Pitch and time scale modifications

From the synthesis pitch-marks to the modified waveform

The simplest way is:

1. For each synthesis pitch-mark, find the analysis pitch-mark nearest to its virtual pitch-mark.

2. Center the frames that correspond to these nearest analysis pitch-marks on the synthesis pitch-marks.

3. Add the overlapping regions together.
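The steps above can be sketched as follows for time-scale lengthening, assuming a constant pitch period, Hann frames two periods long, and the mapping "virtual pitch-mark = synthesis pitch-mark / alpha" (the function names and parameters are illustrative).

```python
import math

def nearest(marks, t):
    """Analysis pitch-mark closest to time t."""
    return min(marks, key=lambda m: abs(m - t))

def td_psola_stretch(x, analysis_marks, period, alpha):
    """Copy the frame of the nearest analysis mark onto each synthesis mark."""
    w = [0.5 - 0.5 * math.cos(math.pi * i / period) for i in range(2 * period)]
    out_len = int(len(x) * alpha)
    y = [0.0] * out_len
    s = analysis_marks[0]                          # first synthesis pitch-mark
    while s + period <= out_len:
        a = nearest(analysis_marks, s / alpha)     # step 1: nearest analysis mark
        for i in range(2 * period):
            src, dst = a - period + i, s - period + i
            if 0 <= src < len(x) and 0 <= dst < out_len:
                y[dst] += w[i] * x[src]            # steps 2 and 3: center and add
        s += period                                # pitch period is unchanged
    return y

period = 50
x = [math.sin(2 * math.pi * n / period) for n in range(1000)]
marks = list(range(period, 1000 - period + 1, period))
y = td_psola_stretch(x, marks, period, alpha=2.0)
```

For this ideal periodic input the output is the same sinusoid at twice the duration, since frames are repeated whole periods apart.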


Page 19: Pitch and time scale modifications

From the synthesis pitch-marks to the modified waveform

In more sophisticated systems, the mapping involves linear interpolation between the two successive short-time analysis signals lying closest to the virtual pitch-mark.

The perceptual quality of prosody-modified speech produced by PSOLA methods depends on the accuracy of the pitch-marker estimation. Estimating epochs from the speech signal provides more accurate pitch-marker locations.
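A minimal sketch of the interpolated mapping, assuming the weight is the relative position of the virtual pitch-mark between its two neighboring analysis pitch-marks (the name `interpolated_frame` and its parameters are hypothetical; windowing is left out for brevity).

```python
def interpolated_frame(x, a0, a1, virtual, half):
    """Blend the frames centered on analysis marks a0 and a1 (a0 <= virtual <= a1)."""
    lam = (virtual - a0) / (a1 - a0)     # 0 at a0, 1 at a1
    f0 = x[a0 - half:a0 + half]
    f1 = x[a1 - half:a1 + half]
    return [(1 - lam) * u + lam * v for u, v in zip(f0, f1)]

ramp = [float(n) for n in range(100)]
frame = interpolated_frame(ramp, a0=20, a1=40, virtual=25, half=5)
```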

Page 20: Pitch and time scale modifications

LP-PSOLA & FD-PSOLA

The Frequency-Domain PSOLA (FD-PSOLA) and the Linear-Predictive PSOLA (LP-PSOLA) approaches are theoretically more appropriate than the time-domain PSOLA method for pitch-scale modifications because they provide independent control over the spectral envelope of the synthesis signal.

Page 21: Pitch and time scale modifications

Frequency-domain techniques

Frequency-domain algorithms operate with a short-time spectrum of the signal (phase-vocoder)

1. Calculate the short-time Fourier transform (STFT) of the signal.

2. Modify phases of each frequency channel.

3. Synthesize a signal using inverse STFT with a different time stride
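The three steps can be sketched as a basic phase vocoder. This is an illustration under stated assumptions, not a reference implementation: a Hann window, a frame length of 64, an analysis stride of 8, a naive DFT so that the block needs only the standard library, and hypothetical names throughout.

```python
import cmath
import math

def dft(frame, sign):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def wrap(p):
    """Wrap a phase to the interval [-pi, pi)."""
    return (p + math.pi) % (2 * math.pi) - math.pi

def phase_vocoder(x, alpha, n=64, hop_a=8):
    hop_s = int(round(alpha * hop_a))              # synthesis stride
    w = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    omega = [2 * math.pi * k / n for k in range(n)]    # channel center frequencies
    frames = (len(x) - n) // hop_a + 1
    y = [0.0] * ((frames - 1) * hop_s + n)
    norm = [0.0] * len(y)
    prev_phase, synth_phase = None, None
    for m in range(frames):
        seg = [w[i] * x[m * hop_a + i] for i in range(n)]
        spec = dft(seg, -1)                        # step 1: one STFT frame
        phase = [cmath.phase(c) for c in spec]
        if m == 0:
            synth_phase = phase[:]
        else:                                      # step 2: modify channel phases
            for k in range(n):
                dev = wrap(phase[k] - prev_phase[k] - omega[k] * hop_a)
                synth_phase[k] += hop_s * (omega[k] + dev / hop_a)
        prev_phase = phase
        new_spec = [abs(c) * cmath.exp(1j * p) for c, p in zip(spec, synth_phase)]
        out = dft(new_spec, +1)                    # step 3: inverse, new stride
        for i in range(n):
            y[m * hop_s + i] += w[i] * out[i].real / n
            norm[m * hop_s + i] += w[i] ** 2
    # Divide by the summed squared window to undo the double windowing.
    return [v / s if s > 1e-8 else 0.0 for v, s in zip(y, norm)]

tone = [math.sin(2 * math.pi * 0.1 * t) for t in range(512)]
slow = phase_vocoder(tone, alpha=2.0)              # longer signal, same frequency
```

The naive DFT keeps the sketch dependency-free; a practical version would use an FFT.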


Page 22: Pitch and time scale modifications

Parametric techniques

Linear prediction models

Sinusoidal models

The Harmonic plus Noise Model (HNM)

Wideband models

STRAIGHT

Page 23: Pitch and time scale modifications

Conclusion

Time-domain approaches are computationally cheap and perform well for small modification factors.

They are well suited to real-time implementations.

It is possible to incorporate such systems in consumer products such as telephone answering systems.

They suffer from echoes.

In particular, time- or pitch-scale modifications by large factors cannot be carried out by time-domain methods and usually require the more elaborate frequency-domain techniques.

Page 24: Pitch and time scale modifications

Conclusion

Frequency-domain techniques are capable of providing very high quality output. However, they still suffer from some distortion, mainly due to the effects of "phase dispersion."

They are computationally intensive.

Page 25: Pitch and time scale modifications

Conclusion

Parametric techniques tend to outperform non-parametric methods when the signal to be modified matches the underlying model well. When this is not the case, however, the methods break down and the results are unreliable.

Parametric techniques are usually more costly in terms of computation, because they require an explicit preliminary analysis stage to estimate the model parameters.
