
Study on Frequency Domain Primary-Ambient Extraction (PAE)

HE Jianjun

PhD Candidate, DSP Lab, School of EEE, Nanyang Technological University, Singapore

Email: JHE007@e.ntu.edu.sg

Introduction: PAE-based Spatial Audio System

[Figure: block diagram of the PAE-based spatial audio system. Blocks: Input, Pre-processing, PAE (producing primary components and ambient components), Primary rendering, Ambient rendering, Spatial attributes, Post-processing, Output.]

Stereo Signal Model

Signal = Primary + Ambient

$x_L = p_L + a_L$: Left channel

$x_R = p_R + a_R$: Right channel

Assumptions

Primary components highly correlated: $p_R = k\,p_L$

Ambient components uncorrelated: $a_L \perp a_R$

Primary and ambient components uncorrelated: $p_{L(R)} \perp a_{L(R)}$

Ambient power balanced: $P_{a_L} = P_{a_R}$

Stereo Signal Model

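Primary panning factor (PPF):

$$k = \frac{r_{RR} - r_{LL}}{2 r_{LR}} + \sqrt{\left(\frac{r_{RR} - r_{LL}}{2 r_{LR}}\right)^2 + 1}$$

Primary power ratio (PPR) = total primary power / total signal power:

$$\gamma = \frac{2 r_{LR} + (r_{RR} - r_{LL})\,k}{(r_{RR} + r_{LL})\,k}, \quad \gamma \in [0, 1]$$

$r_{LL}$, $r_{RR}$: autocorrelation of the left and right channel; $r_{LR}$: cross-correlation between the left and right channel.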

[Figure: panning scale for k, from Left (k = 1/10) through Center to Right (k = 10).]
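The two parameters can be estimated directly from the zero-lag correlations of one frame. The following is a minimal sketch, assuming the signal model of the slides (primary amplitude-panned by k, uncorrelated ambience of equal power in both channels) and a positive cross-correlation. The function name, the synthetic signals and the values k = 3, γ = 0.9 are illustrative choices, not taken from the author's code.

```python
import numpy as np

def estimate_k_gamma(x_left, x_right):
    """Estimate the panning factor k and power ratio gamma of one frame."""
    # Zero-lag auto- and cross-correlations of the two channels.
    r_ll = np.dot(x_left, x_left)
    r_rr = np.dot(x_right, x_right)
    r_lr = np.dot(x_left, x_right)  # assumed positive (in-phase primary)
    d = (r_rr - r_ll) / (2.0 * r_lr)
    k = d + np.sqrt(d * d + 1.0)
    gamma = (2.0 * r_lr + (r_rr - r_ll) * k) / ((r_rr + r_ll) * k)
    return k, gamma

# Synthetic stereo signal (illustrative values, not the author's data).
rng = np.random.default_rng(0)
n = 200_000
k_true, gamma_true = 3.0, 0.9
p_left = rng.standard_normal(n)
# Per-channel ambient power that yields gamma_true for a unit-power p_left.
ambient_power = (1.0 + k_true**2) * (1.0 - gamma_true) / (2.0 * gamma_true)
a_left = np.sqrt(ambient_power) * rng.standard_normal(n)
a_right = np.sqrt(ambient_power) * rng.standard_normal(n)
x_left = p_left + a_left
x_right = k_true * p_left + a_right

k_est, gamma_est = estimate_k_gamma(x_left, x_right)
print(f"k = {k_est:.3f}, gamma = {gamma_est:.3f}")
```

With long frames the estimates should land close to the values used to build the signal; short frames make the sample correlations noisier.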

PAE in full band, time domain

[Figure: full-band signal plots in the time domain]

Compute parameters: k, γ (from $r_{LL}$, $r_{RR}$, $r_{LR}$ of the full-band signal)
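Once k is known, a PCA-based extraction projects both channels onto the principal direction (1, k). The slides do not preserve the extraction weights, so the sketch below uses the commonly cited PCA solution as an assumption: the primary estimate is the projection and the ambient estimate is the residual. All names and test values are hypothetical.

```python
import numpy as np

def pca_extract(x_left, x_right):
    """Full-band PCA-style primary-ambient split of one stereo frame."""
    r_ll = np.dot(x_left, x_left)
    r_rr = np.dot(x_right, x_right)
    r_lr = np.dot(x_left, x_right)  # assumed positive
    d = (r_rr - r_ll) / (2.0 * r_lr)
    k = d + np.sqrt(d * d + 1.0)
    # Project both channels onto the principal direction (1, k).
    primary_left = (x_left + k * x_right) / (1.0 + k * k)
    primary_right = k * primary_left
    # The residual is taken as the ambient estimate.
    ambient_left = x_left - primary_left
    ambient_right = x_right - primary_right
    return k, primary_left, primary_right, ambient_left, ambient_right

# Illustrative input: one source panned by 3 plus weak uncorrelated noise.
rng = np.random.default_rng(1)
source = rng.standard_normal(4096)
x_left = source + 0.3 * rng.standard_normal(4096)
x_right = 3.0 * source + 0.3 * rng.standard_normal(4096)

k, p_l, p_r, a_l, a_r = pca_extract(x_left, x_right)
print(f"estimated k = {k:.3f}")
```

By construction the extracted primary and ambient parts of each channel are orthogonal and sum back to the input channel.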

PAE in full band, frequency domain

[Figure: full-band signal plots and their spectra]

Compute parameters: k, γ

PAE in subband, frequency domain

[Figure: spectrum X(f) divided into subbands, each with its own panning factor k(f)]

k represents the panning of the source.

Assumption: In each band, only one source is dominant. The overlap among the spectra of different sources should be minimal.
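A per-band panning factor can be sketched by evaluating the same correlation formulas on groups of DFT bins. The example below assumes uniform partitioning and two sinusoidal sources that occupy different bands; the function name, frame length and panning values are illustrative assumptions.

```python
import numpy as np

def subband_panning(x_left, x_right, num_bands):
    """Estimate one panning factor k per uniform frequency partition."""
    X_l = np.fft.rfft(x_left)
    X_r = np.fft.rfft(x_right)
    factors = []
    for bins in np.array_split(np.arange(X_l.size), num_bands):
        # Band-limited correlations computed from the DFT bins of this band.
        r_ll = np.sum(np.abs(X_l[bins]) ** 2)
        r_rr = np.sum(np.abs(X_r[bins]) ** 2)
        r_lr = np.sum((X_l[bins] * np.conj(X_r[bins])).real)
        d = (r_rr - r_ll) / (2.0 * r_lr)  # assumes r_lr != 0 in every band
        factors.append(d + np.sqrt(d * d + 1.0))
    return np.array(factors)

# Two sources in separate bands: low tone panned left, high tone panned right.
n = np.arange(1024)
low = np.cos(2 * np.pi * 50 * n / 1024)
high = np.cos(2 * np.pi * 400 * n / 1024)
x_left = low + high
x_right = 0.5 * low + 2.0 * high

k_bands = subband_panning(x_left, x_right, num_bands=2)
print(k_bands)
```

When the single-dominant-source assumption holds, each band recovers the panning of its own source; a full-band estimate would instead return one compromise value.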

Correlation computation and time shifting

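In time domain:

$$c_{xy}(\tau) = \sum_{n=1}^{N} x(n)\, y(n + \tau), \quad \tau \in [-44, 44]$$

In frequency domain:

$$c_{xy}(\tau) = \begin{cases} \mathrm{IDFT}\{X^*(f)\, Y(f)\}(\tau), & \tau \in [0, 44] \\ \mathrm{IDFT}\{X(f)\, Y^*(f)\}(-\tau), & \tau \in [-44, 0) \end{cases}$$

Find the inter-channel time difference (ICTD):

$$\tau_o = \arg\max_{\tau} c_{xy}(\tau)$$

Apply ICTD in frequency domain:

$$x(n - \tau_o) \xleftrightarrow{\mathrm{DFT}} X(f)\, e^{-j 2\pi f \tau_o / N}$$

$$\begin{bmatrix} P_L(f) \\ P_R(f) \end{bmatrix} = \begin{bmatrix} w_{LpL}(f) & w_{LpR}(f)\, e^{j 2\pi f \tau_o / N} \\ w_{RpL}(f)\, e^{-j 2\pi f \tau_o / N} & w_{RpR}(f) \end{bmatrix} \begin{bmatrix} X_L(f) \\ X_R(f) \end{bmatrix}$$

With $x = x_L$ and $y = x_R$, a positive $\tau_o$ means the right channel lags the left.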
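The ICTD search and the frequency-domain shift can be sketched as follows. This is a minimal example under stated assumptions: circular correlation over a single frame, integer lags only, and the convention c_xy(τ) = Σ x(n) y(n + τ). The helper names and the 20-sample test shift are illustrative.

```python
import numpy as np

def estimate_ictd(x, y, max_lag=44):
    """Return the lag in [-max_lag, max_lag] that maximises c_xy."""
    n = len(x)
    # Circular cross-correlation c_xy(tau) = sum_n x(n) y(n + tau) via the DFT.
    c = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(y)).real
    lags = np.arange(-max_lag, max_lag + 1)
    return int(lags[np.argmax(c[lags % n])])  # negative lags wrap around

def shift_in_frequency(x, lag):
    """Circularly delay x by an integer lag using the DFT shift property."""
    n = len(x)
    f = np.arange(n)
    return np.fft.ifft(np.fft.fft(x) * np.exp(-2j * np.pi * f * lag / n)).real

rng = np.random.default_rng(0)
left = rng.standard_normal(4096)
right = np.roll(left, 20)  # right channel lags the left by 20 samples

ictd = estimate_ictd(left, right)
right_aligned = shift_in_frequency(right, -ictd)  # undo the delay
print(ictd)
```

Restricting the search to ±44 lags keeps the estimate inside a plausible inter-channel delay range and avoids spurious peaks far from zero lag.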

How to partition the bands?

[Figure: spectrum X(f) split into frequency partitions]

Ideally, the number of partitions = number of sources (unknown).

Fixed partitioning: independent of the input
• Uniform (2, 4, 8, etc.)
• Non-uniform (e.g. ERB)

Adaptive partitioning: dependent on the input
• Top-down
• Bottom-up

Based on the inter-channel cross-correlation coefficient (ICC) φ, with two thresholds: φL, φH

Conditions for partition:
• φ0 < φH
• max(φ1, φ2) > φ0
• min(φ1, φ2) > φL
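The three conditions can be turned into a recursive top-down split. The sketch below rests on several assumptions that the slides do not state: φ0 is taken as the ICC of the current band, φ1 and φ2 as the ICCs of its two halves, the ICC is evaluated at zero lag only, and a band is always split at its midpoint. The function names and the `min_bins` stopping rule are hypothetical.

```python
import numpy as np

def icc(X_l, X_r):
    """Zero-lag inter-channel cross-correlation coefficient of one band."""
    num = abs(np.sum(X_l * np.conj(X_r)).real)
    den = np.sqrt(np.sum(np.abs(X_l) ** 2) * np.sum(np.abs(X_r) ** 2))
    return num / den if den > 0 else 0.0

def top_down_partition(X_l, X_r, lo, hi, phi_low=0.1, phi_high=0.8, min_bins=8):
    """Return a list of (lo, hi) bin ranges chosen by top-down splitting."""
    if hi - lo < 2 * min_bins:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    phi0 = icc(X_l[lo:hi], X_r[lo:hi])    # parent band
    phi1 = icc(X_l[lo:mid], X_r[lo:mid])  # lower half
    phi2 = icc(X_l[mid:hi], X_r[mid:hi])  # upper half
    if phi0 < phi_high and max(phi1, phi2) > phi0 and min(phi1, phi2) > phi_low:
        return (top_down_partition(X_l, X_r, lo, mid, phi_low, phi_high, min_bins)
                + top_down_partition(X_l, X_r, mid, hi, phi_low, phi_high, min_bins))
    return [(lo, hi)]

# Toy spectra: lower half panned by 0.2, upper half panned by 5.
X_l = np.ones(128, dtype=complex)
X_r = np.concatenate([0.2 * X_l[:64], 5.0 * X_l[64:]])

bands = top_down_partition(X_l, X_r, 0, 128)
print(bands)
```

In this toy case the full band has a low ICC because two differently panned sources are mixed, while each half is fully correlated, so the band is split once and the recursion then stops.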

Multiple (2) sources

Three cases for the directions of two sources:
1. At different sides (DS)
2. One at the center (C)
3. At the same side (SS)

Four ways to synthesize the source direction:
1. Amplitude panning (AP)
2. Time shifting (TS)
3. Amplitude panning and time shifting (APTS)
4. HRTF filtering (HRTF)

Simulation testing: setup

Primary components: speech, music
Ambient components: white Gaussian noise
Primary power ratio = 0.9
Frame length: 4096
Hanning window, 50% overlap

We test PCA and SPCA with different frequency partitioning:
• Time domain, full band (reference)
• Uniform partitioning with [1, 2, 4, 8, 16, 32, 64] partitions
• Non-uniform partitioning with 20 partitions (Faller, BCC [6])
• Top-down (TD) partitioning, with φL = 0.1, φH = 0.8

[6] C. Faller and F. Baumgarte, "Binaural cue coding-Part II: Schemes and applications," IEEE Trans. Speech and Audio Processing, vol. 11, no. 6, Nov. 2003.

Performance measure: Error-to-Signal Ratio

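$$\mathrm{ESR}_L = \frac{P_{\hat{p}_L - p_L}}{P_{p_L}}, \qquad \mathrm{ESR}_R = \frac{P_{\hat{p}_R - p_R}}{P_{p_R}}$$

$$\mathrm{ESR}\,(\mathrm{dB}) = 10 \log_{10} \frac{\mathrm{ESR}_L + \mathrm{ESR}_R}{2}$$

where $\hat{p}_L$, $\hat{p}_R$ are the extracted primary components and $P_{(\cdot)}$ denotes power.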
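The measure can be computed in a few lines. This is a minimal sketch, assuming the ESR is the error power of each extracted primary channel normalised by the true primary power, averaged over the two channels and expressed in dB. The function name and the 10% scaling error used as input are illustrative.

```python
import numpy as np

def esr_db(p_left, p_right, p_left_hat, p_right_hat):
    """Error-to-signal ratio in dB, averaged over the two channels."""
    esr_left = np.sum((p_left_hat - p_left) ** 2) / np.sum(p_left ** 2)
    esr_right = np.sum((p_right_hat - p_right) ** 2) / np.sum(p_right ** 2)
    return 10.0 * np.log10(0.5 * (esr_left + esr_right))

rng = np.random.default_rng(0)
p_left = rng.standard_normal(4096)
p_right = 3.0 * p_left
# An estimate that is 10% too small in amplitude in both channels.
esr_value = esr_db(p_left, p_right, 0.9 * p_left, 0.9 * p_right)
print(f"ESR = {esr_value:.2f} dB")
```

A lower (more negative) ESR means the extracted primary component is closer to the true one, which is how the result tables should be read.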

Simulation Results: 1 source

Primary component: speech shifted by 20 lags, panned by k=3.

ESR (dB) by partitioning (T: time domain, full band; 1 to 64: uniform partitions; 20non: 20 non-uniform partitions; TD: top-down):

Method  T       1       2       4       8       16      32      64      20non   TD
PCA     -3.69   -3.72   -3.38   -3.45   -3.34   -3.32   -3.16   -3.19   -3.33   -3.72
SPCA    -14.78  -14.85  -12.34  -12.05  -11.52  -11.35  -10.63  -9.30   -10.34  -14.38

1. Generally, SPCA performs better than PCA.
2. The time-domain PCA (SPCA) is very close to the frequency-domain PCA (SPCA) when there is only one partition.
3. Significantly worse performance is found in the frequency-domain approaches with fixed partitioning.
4. The performance of the top-down partitioning is acceptable.

[Figure: estimated primary panning factor k for each frequency partition (two panels)]

Simulation Results: 2 sources

Four ways to synthesize the source direction:
1. Amplitude panning (AP)
2. Time shifting (TS)
3. Amplitude panning and time shifting (APTS)
4. HRTF filtering (HRTF)

Three cases for the directions of two sources:
1. At different sides (DS)
2. One at the center (C)
3. At the same side (SS)

Simulation Results: 2 sources-AP

ESR (dB):

DS      T       1       2       4       8       16      32      64      20non   TD
PCA     -7.95   -8.10   -8.13   -8.22   -8.34   -8.56   -8.56   -9.80   -10.86  n/a
SPCA    -7.94   -8.15   -8.18   -8.26   -8.36   -8.61   -8.39   -9.33   -9.95   -8.36

C       T       1       2       4       8       16      32      64      20non   TD
PCA     -10.15  -10.25  -10.14  -10.22  -10.27  -10.38  -10.34  -11.16  -11.99  n/a
SPCA    -10.14  -10.33  -10.22  -10.24  -10.30  -10.43  -10.04  -10.29  -10.12  -10.38

SS      T       1       2       4       8       16      32      64      20non   TD
PCA     -13.04  -13.10  -11.82  -11.88  -11.75  -11.53  -11.31  -11.40  -11.93  n/a
SPCA    -13.02  -13.23  -12.04  -11.81  -11.65  -11.56  -10.30  -10.24  -10.52  -13.21

n/a: value missing in the source. The PCA rows list only nine values, assumed here to fill the first nine columns.

1. Generally, the performance of SPCA and PCA is similar.
2. The performance is better when the two directions are closer.
3. The frequency-domain approaches with fixed partitioning show some advantage when the primary components are not on the same side.
4. The frequency-domain approach with top-down partitioning yields good performance.

Simulation Results: 2 sources-TS

ESR (dB):

DS      T       1       2       4       8       16      32      64      20non   TD
PCA     -5.16   -5.21   -5.23   -5.24   -5.23   -5.23   -5.26   -5.38   -5.47   -5.21
SPCA    -7.98   -8.44   -8.43   -8.59   -8.69   -8.73   -8.62   -9.12   -8.58   -8.91

C       T       1       2       4       8       16      32      64      20non   TD
PCA     -9.10   -9.14   -9.18   -9.20   -9.18   -9.17   -9.14   -9.18   -9.27   -9.14
SPCA    -9.13   -9.60   -9.70   -9.85   -9.97   -9.85   -9.54   -9.91   -9.35   -10.05

SS      T       1       2       4       8       16      32      64      20non   TD
PCA     -5.37   -5.38   -5.40   -5.42   -5.41   -5.42   -5.41   -5.44   -5.49   -5.38
SPCA    -11.15  -11.65  -11.71  -11.78  -11.94  -11.86  -11.02  -11.20  -9.42   -11.68

1. Clearly, SPCA performs better than PCA.
2. The performance of SPCA is better when neither direction is at the center.
3. The frequency-domain approaches with fixed partitioning show a slight advantage, and the performance does not vary much across partitionings.
4. The frequency-domain approach with top-down partitioning yields the best overall performance.

Simulation Results: 2 sources-APTS

ESR (dB):

DS      T       1       2       4       8       16      32      64      20non   TD
PCA     -5.16   -5.21   -5.23   -5.23   -5.23   -5.23   -5.26   -5.38   -5.47   n/a
SPCA    -7.99   -8.44   -8.43   -8.59   -8.69   -8.73   -8.62   -9.12   -8.58   -8.91

C       T       1       2       4       8       16      32      64      20non   TD
PCA     -8.06   -8.28   -8.19   -8.27   -8.34   -8.46   -8.44   -9.04   -9.55   -8.28
SPCA    -8.07   -8.43   -8.38   -8.40   -8.57   -8.63   -8.44   -8.70   -9.07   -8.68

SS      T       1       2       4       8       16      32      64      20non   TD
PCA     -4.18   -4.18   -3.95   -3.97   -3.91   -3.92   -3.89   -3.87   -3.98   -4.19
SPCA    -10.16  -10.60  -9.89   -9.75   -9.80   -9.77   -9.07   -8.68   -7.29   -10.82

n/a: value missing in the source. The DS PCA row lists only nine values, assumed here to fill the first nine columns.

2. The performance is better when the two directions are closer.
3. The frequency-domain approaches with fixed partitioning perform better when the two directions are not on the same side.
4. The frequency-domain approach with top-down partitioning yields the best overall performance for all cases.

ESR (dB) for different top-down thresholds (φL, φH):

Partitioning      DS PCA   DS SPCA  C PCA    C SPCA   SS PCA   SS SPCA
T                 -4.74    -6.45    -8.06    -8.07    -4.18    -10.16
1                 -5.04    -6.85    -8.28    -8.43    -4.18    -10.60
2                 -5.04    -6.85    -8.19    -8.38    -3.95    -9.89
8                 -5.22    -7.11    -8.34    -8.57    -3.91    -9.80
32                -5.48    -7.25    -8.44    -8.44    -3.89    -9.07
20non             -6.85    -7.73    -9.55    -9.07    -3.98    -7.29
TD (0.1, 0.8)     -5.03    -6.85    -8.28    -8.68    -4.19    -10.82
TD (0.05, 0.8)    -5.03    -8.36    -8.44    -9.06    -4.19    -10.11
TD (0.2, 0.8)     -5.03    -6.85    -8.28    -8.44    -4.19    -10.6
TD (0.2, 0.7)     -5.03    -6.85    -8.28    -8.44    -4.19    -10.6
TD (0.1, 0.7)     -5.03    -6.85    -8.28    -8.68    -4.19    -10.6
TD (0.05, 0.7)    -5.03    -7.93    -8.44    -8.58    -4.19    -10.41
TD (0.05, 0.9)    -5.03    -8.58    -8.44    -9.93    -4.19    -8.53
TD (0.1, 0.9)     -5.03    -6.85    -8.27    -8.6     -4.19    -10.27
TD (0.2, 0.9)     -5.03    -6.85    -8.27    -8.44    -4.19    n/a

n/a: value missing in the source.

Simulation Results: 2 sources-HRTF

ESR (dB):

DS      T       1       2       4       8       16      32      64      20non   TD
PCA     -3.28   -3.46   -3.46   -3.44   -3.50   -3.74   -3.77   -3.89   -3.85   -2.47
SPCA    -6.49   -6.07   -6.08   -6.13   -6.42   -7.33   -5.70   -5.56   -5.71   -6.52

C       T       1       2       4       8       16      32      64      20non   TD
PCA     -6.96   -7.16   -7.09   -7.12   -7.21   -7.37   -7.50   -7.60   -7.65   -7.16
SPCA    -7.41   -7.97   -7.89   -7.92   -7.87   -8.33   -6.97   -6.15   -7.17   -8

SS      T       1       2       4       8       16      32      64      20non   TD
PCA     -1.14   -1.26   -1.14   -1.15   -1.12   -1.06   -1.04   -1.07   -1.19   -1.88
SPCA    -6.58   -6.81   -6.48   -6.48   -6.70   -7.44   -2.51   -2.44   -2.90   -6.98

2. The performance of SPCA is better when one direction is at the center.
3. The frequency-domain approaches with fixed partitioning show better performance only for some partitionings.

Summary of Simulation Results: 2 sources

1. The overall performance of PAE: AP > TS > APTS > HRTF.

2. SPCA performs better than PCA in almost all cases.

3. The performance varies as the directions of the sources change.

4. The frequency-domain approaches with fixed partitioning do not always give better performance.

5. The frequency-domain approach with top-down partitioning yields the best overall performance in most cases.

How about perceptual testing? Usually only one source (speech) is better extracted, because the spectrum of speech (as compared to music) is more concentrated in certain bands.

Conclusions and thoughts

1. A study of frequency-domain PAE with different partitioning was conducted. It targets multiple primary components that arrive concurrently from different directions. The two PAE approaches tested are PCA and shifted PCA (SPCA).

2. Much consideration should go into the partitioning of the frequency bands.

3. SPCA outperforms PCA in almost all cases.

4. The performance of frequency-domain PAE with fixed partitioning is not consistent as the directions of the sources and the number of partitions change.

5. The frequency-domain approach with top-down partitioning yields promising results in most cases.

Thoughts:
• A more robust partitioning is required. How should the thresholds be determined for other input signals?
• A more accurate estimation of the primary panning factor and of ICTD/ICC is needed.
• How about other PAE approaches, such as least squares?
