Analysis/Synthesis of the andean Quena via Harmonic Band ...

Analysis/Synthesis of the Andean Quena via Harmonic Band Wavelet Transform

Aldo Díaz, Rafael Mendes (Advisor)

Departamento de Engenharia de Computação e Automação Industrial (DCA)
Faculdade de Engenharia Elétrica e de Computação (FEEC)
Universidade Estadual de Campinas (Unicamp)
Caixa Postal 6101, 13083-970 – Campinas, SP, Brasil

{alddiaz,rafael}@dca.fee.unicamp.br

Abstract – Digital Signal Processing (DSP) provides powerful techniques for the analysis and synthesis of audio signals. One of the challenges in the analysis of musical instruments is the definition of signal representations that describe the relevant sound characteristics and support the extraction of information. The goal of this study is the analysis of the Peruvian quena flute by means of the Harmonic Band Wavelet Transform (HBWT), a convenient representation of harmonic sound content based on its 1/f fractal characteristics. Two sound transformations, noise filtering and pitch-shifting, are analyzed, and the quena is compared with two other wind instruments, the recorder and the melodica. The comparison focuses on the $\gamma_p$ fractal attribute. It can be concluded that the HBWT is a relevant representation for quena sound transformations and that the $\gamma_p$ fractal feature has great potential for instrument comparison and classification.

Keywords – quena analysis; 1/f noise; noise filtering; pitch-shifting; sound fractal comparison.

1. Introduction

Recently, fractal decomposition techniques have obtained interesting results in musical instrument recognition and classification compared to classical state-of-the-art methods [6, 1]. Most of these techniques use the box-counting method for signal decomposition [2]. A different approach proposed the use of DSP Perfect Reconstruction (PR) and Multirate Filter Bank (MFB) techniques [4]. PR and MFB concepts are interesting because the signal decomposition is performed by frequency-overlapping filters that allow aliasing in the subbands and still achieve distortion-free signal reconstruction. For instance, digital audio formats such as MP3 and Ogg Vorbis exploit PR and MFB techniques for audio compression. Based on this reasoning, it was hypothesized that the Harmonic Band Wavelet Transform (HBWT) [4] is relevant for sound transformations and signal analysis. So far, the HBWT has been used for audio coding and compression, whereas the objective of this study is to explore the technique as a tool for quena sound transformations and analysis, and for comparison with the recorder and melodica based on the HBWT $\gamma_p$ fractal parameter.

2. Method

A sound database was generated with recordings of quena, recorder, and melodica notes. The samples were notes of one second duration played at mezzo-forte intensity and recorded in mono at a sample rate of 44.1 kHz with 16 bits of resolution. A total of 27 sounds were generated: 8 for quena, 9 for recorder, and 10 for melodica (Figure 1).

Figure 1. Frequency ranges of the analyzed instruments (melodica, quena, recorder) on a note axis from B3 to D6, where the overlap between them can be seen.

2.1. HBWT decomposition

The HBWT belongs to the family of Spectral Modeling Synthesis (SMS) methods. Most SMS methods lack an explicit model for the stochastic component of sound (noise), which is indispensable for the perception of musical instruments. The HBWT, based on the pseudo-periodic 1/f noise model, provides a specific model for noise characterization [4]. It matches the fractal characteristic of the signal (pseudo-periodic 1/f noise), which is inherent to the sound of musical instruments, in order to decompose the sidebands of the harmonics. Figure 2 depicts the block diagram of the HBWT implementation.

2.2. Analysis

The HBWT analysis structure is composed of perfect reconstruction filter banks implementing the Modified Discrete Cosine Transform (MDCT) and the Discrete Wavelet Transform (DWT), both techniques widely used in audio coding and compression. The function of the MDCT filters is to separate each harmonic of the signal into two sidebands, one on each side. The MDCT bank is implemented with cosine-modulated type IV bases [3].

Figure 2. Structure of filters implementing the Harmonic Band Wavelet Transform. (a) Analysis bank: $P$-channel MDCT followed by a DWT in each channel. (b) Synthesis bank: IDWT in each channel followed by a $P$-channel IMDCT.

Figure 3. MDCT channels ($p = 1, \ldots, 6$) synchronized with the average period $P = 99$ of the quena A4 signal. Axes: frequency (Hz) versus magnitude (dB).

The input sound signal is denoted by $s(l)$. The number of MDCT channels $P$ is equal to the average period of $s(l)$ in samples. Figure 3 shows the spectrum of the first three harmonics of a quena A4 (fundamental frequency $f_0 = 444.8$ Hz) and the MDCT filters synchronized to the average period $P = 99$. Following the recommendations in [4], the DWT filter bank is implemented using order 11 Daubechies filters. The number of scale levels $N$ varies according to the instrument; for the quena, up to $N = 4$ levels are suitable. Figure 4 shows the frequency responses of an 8-channel MDCT and of the order 11 Daubechies filters implementing the DWT.
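The synchronization between the MDCT channels and the pitch period can be checked numerically. The sketch below is illustrative only: it assumes the channel count is obtained as the floor of $f_s / f_0$ and that the $P$ channels are uniform with width $f_s / (2P)$, so that harmonic $k$ falls on the boundary between channels $p = 2k - 1$ and $p = 2k$. The function names are hypothetical and do not come from the reference implementation.

```python
import math

def mdct_channel_count(fs: float, f0: float) -> int:
    """Average pitch period in samples, taken as floor(fs / f0) (assumed rule)."""
    return math.floor(fs / f0)

def sideband_edges(fs: float, num_channels: int, harmonic: int):
    """Band edges in Hz of the channels to the left and right of a harmonic.

    Assumes uniform channels of width fs / (2 * P), so that harmonic k sits
    on the boundary between channels p = 2k - 1 and p = 2k.
    """
    width = fs / (2 * num_channels)
    left = ((2 * harmonic - 1) * width, 2 * harmonic * width)
    right = (2 * harmonic * width, (2 * harmonic + 1) * width)
    return left, right

fs = 44100.0   # sample rate of the recordings
f0 = 444.8     # quena A4 fundamental reported in the text
P = mdct_channel_count(fs, f0)
print("P =", P)
for k in (1, 2, 3):
    left, right = sideband_edges(fs, P, k)
    print(f"harmonic {k}: left {left[0]:.1f}-{left[1]:.1f} Hz, "
          f"right {right[0]:.1f}-{right[1]:.1f} Hz")
```

Under these assumptions the shared edge of the two channels around the first harmonic lands within about 1 Hz of the reported fundamental, which is consistent with the channel layout of Figure 3.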

The analysis procedure has the following steps:

Step 1 Decomposition of the input signal $s(l)$ by an MDCT filter bank with $P$ channels. Each filter $\tilde{F}_p(z)$ is of band-pass type with bandwidth equal to $\pi/P$.
Step 2 Decimation by a factor of $P$ in each channel.
Step 3 Wavelet decomposition of the output of the decimation blocks by a $P$-channel DWT filter bank.

Figure 4. Frequency responses of MDCT and DWT. Panels: magnitude of the Daubechies-11 filters $H(z)$ and $G(z)$ over normalized frequency from 0 to $\pi$; magnitude (dB) of the 8-channel MDCT, channels $p = 0, \ldots, 7$.

As a result of the analysis, the sidebands of the harmonics of the signal $s(l)$ are decomposed into two components: a deterministic one, represented by the scale coefficients $a_{p,N}$, and a stochastic one, represented by the wavelet coefficients $b_{p,n}$, where $p$ is the channel index, $n$ the scale index, and $N$ the total number of scale levels.

2.3. Synthesis

The synthesis procedure has the following steps:

Step 1 Wavelet reconstruction by a $P$-channel filter bank implementing the inverse DWT (IDWT).
Step 2 Expansion by a factor of $P$ in each channel.
Step 3 MDCT reconstruction by a $P$-channel IMDCT (inverse MDCT).
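The perfect reconstruction property of the MDCT/IMDCT pair can be illustrated in isolation. The following is a minimal sketch, not the implementation used in the study: it uses the textbook MDCT definition with a sine window instead of the arbitrary-length cosine-modulated design of [3], a small even channel count ($P = 8$), direct summation instead of a fast algorithm, and it omits the DWT/IDWT stages. Each frame of $2P$ samples yields $P$ coefficients with a hop of $P$ samples, which plays the role of the decimation by $P$; overlap-adding the inverse frames plays the role of the expansion. All function names are hypothetical.

```python
import math

def sine_window(num_channels):
    """Sine window of length 2P; satisfies w[n]^2 + w[n + P]^2 = 1."""
    length = 2 * num_channels
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(frame, window):
    """Windowed MDCT: 2P input samples -> P coefficients (one per channel)."""
    P = len(frame) // 2
    return [
        sum(window[n] * frame[n]
            * math.cos(math.pi / P * (n + 0.5 + P / 2) * (k + 0.5))
            for n in range(2 * P))
        for k in range(P)
    ]

def imdct(coeffs, window):
    """Windowed inverse MDCT: P coefficients -> 2P time-aliased samples."""
    P = len(coeffs)
    return [
        window[n] * (2.0 / P)
        * sum(coeffs[k] * math.cos(math.pi / P * (n + 0.5 + P / 2) * (k + 0.5))
              for k in range(P))
        for n in range(2 * P)
    ]

def analyze(signal, P):
    """Slide a 2P-sample frame over the zero-padded signal with hop P."""
    window = sine_window(P)
    tail = (-len(signal)) % P
    padded = [0.0] * P + list(signal) + [0.0] * (tail + P)
    return [mdct(padded[start:start + 2 * P], window)
            for start in range(0, len(padded) - P, P)]

def synthesize(frames, P, length):
    """Overlap-add the inverse frames and trim the padding."""
    window = sine_window(P)
    out = [0.0] * ((len(frames) + 1) * P)
    for m, coeffs in enumerate(frames):
        for n, value in enumerate(imdct(coeffs, window)):
            out[m * P + n] += value
    return out[P:P + length]

P = 8
signal = [math.sin(2 * math.pi * 3 * l / 64) + 0.1 * ((l * 7) % 5)
          for l in range(64)]
frames = analyze(signal, P)
restored = synthesize(frames, P, len(signal))
max_error = max(abs(x - y) for x, y in zip(signal, restored))
print(f"{len(frames)} frames of {P} channels, max error {max_error:.2e}")
```

Each inverse frame on its own contains time-domain aliasing; the aliasing terms of adjacent frames cancel in the overlap-add, so the reconstruction error stays at the level of floating-point rounding.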

The main idea of the HBWT is to model a sound by assuming that it is composed of a set of periodic 1/f-like stochastic processes. The formal definition of these processes can be found in [5]. One of the benefits of the 1/f noise-like model is that only two parameters are needed to describe the complete behavior of the sound signal: the parameter $\sigma_p^2$, which controls the amplitude, and the parameter $\gamma_p$, which controls the slope of the pseudo-periodic 1/f-noise-like sidebands of the spectrum. Proposition 3.4 in [4] gives the variances of the analysis coefficients as $\mathrm{Var}\{b_{p,n}[m]\} = \sigma_p^2 \, 2^{n\gamma_p}$. Taking the base-2 logarithm of the energies of the subbands $n$ of a single sideband $p$ yields a linear relationship in the parameter $\gamma_p$:

$$\log_2\left(\mathrm{Var}\{b_{p,n}[m]\}\right) = \gamma_p n + \mathrm{const}, \qquad (1)$$

where the constant equals $\log_2 \sigma_p^2$ and the spectral parameter $\gamma_p$ is related to the self-similarity parameter $H_p$ (Hurst exponent) according to Eq. 2. For fractional Brownian motion (fBm) or 1/f noise, $0 < H < 1$ and $1 < \gamma < 3$ [5].

$$\gamma_p = 2H_p + 1 \qquad (2)$$
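The linear relationship of Eq. 1 suggests a direct way to estimate the slope parameter: fit a straight line to the base-2 logarithm of the subband variances against the scale index. The sketch below assumes an ordinary least-squares fit and uses synthetic variances generated from assumed values ($\sigma_p^2 = 0.5$, $\gamma_p = 2$); with real data the inputs would be the sample variances of the wavelet coefficients at each scale. The function names are hypothetical.

```python
import math

def estimate_gamma(variances):
    """Least-squares fit of log2(variance) against the scale index n = 1..N.

    Returns (slope, intercept); the slope estimates gamma_p and the
    intercept estimates log2(sigma_p^2).
    """
    scales = list(range(1, len(variances) + 1))
    logs = [math.log2(v) for v in variances]
    mean_n = sum(scales) / len(scales)
    mean_y = sum(logs) / len(logs)
    slope = (sum((n - mean_n) * (y - mean_y) for n, y in zip(scales, logs))
             / sum((n - mean_n) ** 2 for n in scales))
    intercept = mean_y - slope * mean_n
    return slope, intercept

def hurst_from_gamma(gamma):
    """Invert Eq. 2: gamma = 2H + 1."""
    return (gamma - 1.0) / 2.0

# Synthetic check with assumed values sigma^2 = 0.5 and gamma = 2.0.
sigma2, gamma_true = 0.5, 2.0
variances = [sigma2 * 2.0 ** (n * gamma_true) for n in range(1, 7)]
gamma_hat, intercept = estimate_gamma(variances)
print("gamma:", gamma_hat)
print("sigma^2:", 2.0 ** intercept)
print("H:", hurst_from_gamma(gamma_hat))
```

Because the synthetic variances follow the model exactly, the fit recovers the assumed parameters up to rounding; measured variances would scatter around the line.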

3. Results

3.1. Harmonic 1/f noise filtering

Two experiments were carried out in order to filter the 1/f noise. In the first experiment, Synthesis A, only the wavelet coefficients $b_{p,n}$ (3 scale levels) were used as inputs for the synthesis. In Synthesis B, only the scale coefficients $a_{p,N}$ (3 scale levels) were used as inputs. The results are shown in Figure 5. Experiment A isolated the 1/f noise associated with the sidebands of the harmonics. Experiment B produced the 1/f-filtered signal (purely harmonic content). These results favor the study of the signal fluctuations related to the 1/f noise, which makes an important contribution to the perception of the sound. In the case of the quena, the 1/f signal of Synthesis A was found to be related to the blowing air, which is the sound-producing mechanism of the instrument. Furthermore, the 1/f fractal characteristic recurs in a variety of physical and biological phenomena, which encourages our study [5].
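The two syntheses amount to zeroing one family of coefficients before reconstruction. The sketch below illustrates this on a single stand-in channel signal with three scale levels. It is a simplification under stated assumptions: a Haar wavelet replaces the order 11 Daubechies filters for brevity, the MDCT stage is omitted, and the test signal is artificial. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_analysis(signal, levels):
    """Return (a_N, [b_1, ..., b_N]); length must be divisible by 2**levels."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(x - y) / SQRT2 for x, y in pairs])
        approx = [(x + y) / SQRT2 for x, y in pairs]
    return approx, details

def haar_synthesis(approx, details):
    """Invert haar_analysis, starting from the coarsest scale."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, b in zip(current, detail):
            rebuilt.extend([(a + b) / SQRT2, (a - b) / SQRT2])
        current = rebuilt
    return current

def zeros_like(details):
    """Zeroed copy of a list of detail-coefficient lists."""
    return [[0.0] * len(d) for d in details]

# Stand-in for one decimated channel: slow trend plus a fast fluctuation.
channel = [math.cos(2 * math.pi * m / 32) + 0.2 * (-1) ** m for m in range(32)]
a_N, b = haar_analysis(channel, levels=3)

# Synthesis A: wavelet coefficients only (fluctuation part).
synthesis_a = haar_synthesis([0.0] * len(a_N), b)
# Synthesis B: scale coefficients only (smooth part).
synthesis_b = haar_synthesis(a_N, zeros_like(b))

recombined = [x + y for x, y in zip(synthesis_a, synthesis_b)]
max_error = max(abs(x - y) for x, y in zip(channel, recombined))
print(f"A + B reproduces the channel, max error {max_error:.2e}")
```

Since the transform is linear, the outputs of the two syntheses add up to the original channel signal, so the experiments split the sound into complementary parts rather than discarding information.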

3.2. Pitch-shifting via HBWT

In sound synthesis, pitch-shifting is a technique that modifies (shifts) the fundamental frequency $f_0$ (pitch). In terms of the HBWT decomposition, pitch-shifting is interpreted as a frequency modulation process, since the frequency content extracted in the analysis is translated to new frequency bands in the synthesis stage. The benefit of the implementation via HBWT is that the procedure reduces to only two steps:

Step 1 Design the MDCT synthesis filter bank with the appropriate number of channels $P_2 = \lfloor f_s / f'_0 \rfloor$, where $f_s$ is the sample rate, in order to modulate to the new fundamental frequency $f'_0$.
Step 2 Apply the original HBWT analysis coefficients as input to the synthesis bank.

Figure 5. Above: first five harmonics of the quena. Below: synthesis results of experiments A and B. Axes: frequency (Hz) versus magnitude (dB).

Figure 6. Pitch-shifting of the A4 note ($f_0 = 445$ Hz, above) to D5 ($f'_0 = 587$ Hz, below). Axes: frequency (Hz) versus magnitude (dB).

Figure 6 shows the modification of the quena note A4 ($f_0 = 445$ Hz) to D5 ($f'_0 = 587$ Hz). High-quality pitch-shifted signals that preserved the acoustic character of the original quena sound were obtained.
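Because the number of synthesis channels is an integer, the reachable pitches form a grid. The sketch below computes the channel count for the A4 to D5 example and the deviation of the implied pitch from the target. It assumes that a bank with $P_2$ channels implies a fundamental of $f_s / P_2$; this is an inference from the synchronization of the channels with the pitch period, not a statement taken from [4]. The function names are hypothetical.

```python
import math

def synthesis_channels(fs, target_f0):
    """Channel count of the synthesis bank, floor(fs / f0') (floor rule of Step 1)."""
    return math.floor(fs / target_f0)

def cents(f_ref, f):
    """Interval from f_ref to f in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f / f_ref)

fs = 44100.0
source_f0, target_f0 = 445.0, 587.0   # A4 and D5 values quoted in the text

P1 = synthesis_channels(fs, source_f0)
P2 = synthesis_channels(fs, target_f0)
implied_f0 = fs / P2                  # assumed pitch implied by P2 channels
deviation = cents(target_f0, implied_f0)

print(f"P1 = {P1}, P2 = {P2}")
print(f"implied f0 = {implied_f0:.1f} Hz, deviation = {deviation:.2f} cents")
```

For this example the quantization error stays within a few cents; it grows for higher target pitches, where each step in the channel count corresponds to a larger frequency change.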

3.3. Quena $\gamma_p$ analysis and comparison

The comparison between quena, melodica, and recorder is established by analysis of the $\gamma_p$ fractal characteristic (1/f slope) at the fundamental frequency $f_0$ of the same musical note played on all instruments. The parameter $\gamma_p$ is computed by linear regression according to Eq. 1. The results are shown in Figure 7 for the first harmonic of the C5 note ($f_0 = 523.25$ Hz). It was found that the quena $\gamma_p$ is closer in value to that of the recorder than to that of the melodica. This result could be explained by the fact that the first two instruments share closer physical characteristics, such as the tube resonator of the flutes. In contrast, the melodica analysis returned greater differences with respect to the quena, which could be related to its free-reed mechanism, but also to other distinctions such as blowing air pressure and articulation. Based on the 1/f fractal analysis, these findings support our initial hypothesis that $\gamma_p$ is a useful parameter for the description of musical instruments, with potential as a feature for recognition and classification applications.

Figure 7. Quena (Q), melodica (M), and recorder (R) analysis of the left and right sidebands (channels $p = 1, 2$) for the first harmonic of the C5 note with six subbands (wavelet scales $n = 1, \ldots, 6$). Axes: wavelet scale $n$ versus $\log_2(\mathrm{Var}\{b_{p,n}[m]\})$. Legend values, left sideband: Q $\gamma_{1,L} = 2.68$, M $\gamma_{1,L} = 1.35$, R $\gamma_{1,L} = 2.71$; right sideband: Q $\gamma_{1,R} = 2.26$, M $\gamma_{1,R} = 2.48$, R $\gamma_{1,R} = 3.63$.

4. Conclusion

This article presented an analysis of the quena by means of two sound transformations via the Harmonic Band Wavelet Transform and a sound comparison with two other wind instruments via the $\gamma_p$ fractal parameter. The method described in this study had not previously been applied to any Andean instrument. By means of the HBWT, a suitable quena sound representation was obtained for the deployment of sound transformation and sound analysis applications. Two sound transformations using the pseudo-periodic 1/f fractal model were presented. The first one, harmonic 1/f noise filtering, was used either to isolate the 1/f noise or to reconstruct the harmonic content. The second one, pitch-shifting, which is interpreted as a modulation process, can be used to synthesize new signals with a different fundamental frequency. Both transformations produced acoustically interesting results. The last experiment was the comparison between the quena, recorder, and melodica in terms of the $\gamma_p$ fractal parameter. The results point to the $\gamma_p$ parameter as a promising attribute for instrument description. The work presented in this study is part of ongoing research on quena signal analysis and characterization based on its fractal attributes. A possible extension to other Andean woodwind instruments is being considered for the near future.

References

[1] S. Gunasekaran and K. Revathy. Fractal dimension analysis of audio signals for Indian musical instrument recognition. In 2008 International Conference on Audio, Language and Image Processing, pages 257–261. IEEE, July 2008.

[2] P. Maragos. Fractal signal analysis using mathematical morphology. In Advances in Electronics and Electron Physics, volume 88, pages 199–246. 1994.

[3] T. Q. Nguyen and R. D. Koilpillai. The theory and design of arbitrary-length cosine-modulated filter banks and wavelets, satisfying perfect reconstruction. IEEE Transactions on Signal Processing, 44(3):473–483, March 1996.

[4] P. Polotti. Fractal additive synthesis: spectral modeling of sound for low rate coding of quality audio. PhD thesis, École polytechnique fédérale de Lausanne, 2003.

[5] G. W. Wornell. Wavelet-based representations for the 1/f family of fractal processes. Proceedings of the IEEE, 81(10):1428–1450, 1993.

[6] A. Zlatintsi and P. Maragos. Multiscale fractal analysis of musical instrument signals with application to recognition. IEEE Transactions on Audio, Speech, and Language Processing, 21(4):737–748, April 2013.