PAC/AAC audio coding standard A. Moreno antonio@ece.gatech.edu Georgia Institute of Technology...

Post on 17-Dec-2015


PAC/AAC audio coding standard

A. Moreno, antonio@ece.gatech.edu, Georgia Institute of Technology, ECE8873, Spring 2004

Overview

Audio Recording

Coding: ultimate goal

AAC Encoder Block Diagram

Principles of Psychoacoustics

Perceptual Entropy

Quantization and Coding

Samples

Introduction

"If a tree falls in the forest with no one around to hear it, does it make a sound?"

Audio Recording

Edison, 1877

Audio Recording

Philips, 1978

A/D Converter

PCM

Coding

Ultimate Goal: reduce the number of bits needed to represent the data.

Bitrate = Fsa x Wordlength
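As a worked example of the bitrate formula, the sketch below evaluates it for uncompressed PCM. The `channels` factor and the 16-bit word length are assumptions added for illustration; the slide states only the sampling rate times the word length.

```python
def pcm_bitrate(sample_rate_hz, word_length_bits, channels=1):
    """Bitrate = Fsa x Wordlength, times an assumed channel count."""
    return sample_rate_hz * word_length_bits * channels

# CD-style audio: 44.1 kHz, 16 bits, stereo (assumed parameters).
cd_bps = pcm_bitrate(44100, 16, channels=2)

# 48 kHz stereo at an assumed 16 bits, compared with a 128 kbps coded stream.
pcm_bps = pcm_bitrate(48000, 16, channels=2)
compression_ratio = pcm_bps / 128000

print(cd_bps, pcm_bps, compression_ratio)
```

Under these assumptions a 128 kbps stream carries one twelfth of the bits of the 48 kHz stereo PCM signal.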

AAC Encoder Block Diagram

[Figure: AAC encoder block diagram. Blocks shown: input s(n); Perceptual Model; Gain Control; MDCT; TNS; Multi-Channel (M/S, Intensity); Prediction (z^-1); Scale Factor Extract; Quant; Entropy Coding; Iterative Rate Control Loop; side information coding and bitstream to the channel.]

Principles of Psychoacoustics

Source localization.

Two ears are necessary.

The brain uses intensity differences and time delays between the two perceived signals.

Principles of Psychoacoustics

Absolute Hearing Threshold

[Figure: absolute hearing threshold curve, with the audible region above it and the inaudible region below it.]
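The slide shows the threshold only as a figure. A minimal sketch, assuming the widely used closed-form approximation attributed to Terhardt (it appears in Painter and Spanias, not on the slide), gives the threshold in quiet in dB SPL as a function of frequency. The function name is hypothetical.

```python
import math

def absolute_threshold_db_spl(f_hz):
    """Approximate threshold in quiet (dB SPL) for a pure tone at f_hz."""
    k = f_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

for f in (100, 1000, 3300, 10000, 16000):
    print(f, round(absolute_threshold_db_spl(f), 2))
```

The curve dips near 3 to 4 kHz, where hearing is most sensitive, and rises steeply at both ends of the audio band.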

Principles of Psychoacoustics

Human Ear Loudness characteristic

Robinson and Dadson equi-loudness contours.

Principles of Psychoacoustics

Critical Bands

Concept introduced by Harvey Fletcher in 1940.

Frequency-to-place transform.

Function of frequency that quantifies the cochlear filter passbands.

Example: the critical band centered at 1 kHz is about 160 Hz wide. A narrow-band noise centered at 1 kHz is perceived with the same loudness as long as its bandwidth stays below 160 Hz.

$$BW_c(f) = 25 + 75\left[1 + 1.4\,(f/1000)^2\right]^{0.69} \quad \text{(Hz)}$$
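The critical bandwidth formula can be checked numerically against the 1 kHz example. The sketch below evaluates it directly; the function name is hypothetical.

```python
def critical_bandwidth_hz(f_hz):
    """Critical bandwidth BW_c(f) = 25 + 75 * [1 + 1.4 * (f/1000)^2]^0.69, in Hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

for f in (100, 500, 1000, 4000, 10000):
    print(f, round(critical_bandwidth_hz(f), 1))
```

At 1 kHz the formula gives roughly 162 Hz, consistent with the "about 160 Hz" figure quoted in the example, and the bandwidth stays near 100 Hz at low frequencies.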

Principles of Psychoacoustics

Simultaneous Masking: Frequency

[Figure: simultaneous masking in frequency, showing audible and inaudible regions.]

Principles of Psychoacoustics

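Simplified Paradigms:

Tone Masking Noise:

$$TH_N = E_T - 14.5 - B$$

Noise Masking Tone:

$$TH_T = E_N - K$$

K = 3 dB ... 5 dB (constant)

[Figure: the two masking paradigms, each drawn over a 1 Bark band with its threshold, TH_N or TH_T, marked.]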
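The two simplified paradigms reduce to one subtraction each. The sketch below assumes, following the usual reading of these symbols in Painter and Spanias, that E_T and E_N are the tone and noise masker levels in dB and B is the critical band number in Bark. The default K = 4 dB is an arbitrary choice inside the 3 dB to 5 dB range, and the function names are hypothetical.

```python
def tone_masking_noise_threshold(e_tone_db, bark_band):
    """TH_N = E_T - 14.5 - B: noise threshold under a tonal masker (dB)."""
    return e_tone_db - 14.5 - bark_band

def noise_masking_tone_threshold(e_noise_db, k_db=4.0):
    """TH_T = E_N - K: tone threshold under a noise masker (dB)."""
    return e_noise_db - k_db

# An 80 dB masker in critical band 10 (illustrative values).
print(tone_masking_noise_threshold(80.0, 10))
print(noise_masking_tone_threshold(80.0))
```

The comparison shows the asymmetry of masking: a noise masker hides a tone much more effectively than a tone of the same level hides noise.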

Principles of Psychoacoustics

Spread of Masking

[Figure: spread of masking; labels: 1 Bark, th.]
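The slide shows the spread of masking only as a figure. A minimal sketch, assuming the analytical spreading function commonly attributed to Schroeder (it is not given on the slide), models how masking falls off with the distance x in Bark from the masker. The function name is hypothetical.

```python
import math

def spreading_function_db(x_bark):
    """Assumed spreading function: level in dB at x_bark Barks from the masker."""
    s = x_bark + 0.474
    return 15.81 + 7.5 * s - 17.5 * math.sqrt(1.0 + s * s)

for x in (-3, -2, -1, 0, 1, 2, 3):
    print(x, round(spreading_function_db(x), 2))
```

The function is close to 0 dB at the masker and decays more slowly on the positive side than on the negative side, so masking reaches further in one direction along the Bark axis.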

Principles of Psychoacoustics

Masking: Temporal

Perceptual Entropy

Perceptual Entropy: an objective metric of perceptually relevant information, introduced by J. Johnston.

The perceived information from an audio signal is only a fraction of the total information emanated by the source.

Perceptual Entropy

Procedure:

1. Window and transform to frequency.

2. Masking threshold is computed using perceptual rules.

3. A determination is made of the number of bits required to quantize the spectrum without injecting perceptible noise.

Perceptual Entropy

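[Figure: processing chain. s(n), Hann Window, MDCT, Determine nature (noise-like or tone-like), Apply thresholding rules, JND estimates.]

Spectral Flatness Measure:

$$SFM = \frac{\mu_g}{\mu_a}$$

Coefficient of ‘Tonality’:

$$\alpha = \min\left(\frac{SFM_{dB}}{-60},\ 1\right)$$

Offset:

$$O_i = \alpha\,(14.5 + i) + (1 - \alpha)\,5.5 \quad \text{(dB)}$$

JND Estimates:

$$T_i = 10^{\log_{10}(C_i) - (O_i/10)}$$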
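The chain from spectral flatness to JND estimate can be sketched as follows. The sketch assumes that the two means in the SFM are the geometric and arithmetic means of the power spectrum and that C_i is the band energy after spreading; neither definition is spelled out on the slide, and all function names are hypothetical.

```python
import math

def spectral_flatness_db(power):
    """SFM in dB: 10*log10(geometric mean / arithmetic mean); power values must be > 0."""
    n = len(power)
    log10_geo = sum(math.log10(p) for p in power) / n
    arith = sum(power) / n
    return 10.0 * (log10_geo - math.log10(arith))

def tonality(sfm_db):
    """alpha = min(SFM_dB / -60, 1): 0 for noise-like, 1 for tone-like."""
    return min(sfm_db / -60.0, 1.0)

def offset_db(alpha, i):
    """O_i = alpha * (14.5 + i) + (1 - alpha) * 5.5, in dB, for band i."""
    return alpha * (14.5 + i) + (1.0 - alpha) * 5.5

def jnd_threshold(c_i, o_i):
    """T_i = 10^(log10(C_i) - O_i / 10)."""
    return 10.0 ** (math.log10(c_i) - o_i / 10.0)

flat = [1.0] * 8               # noise-like spectrum
peaky = [1e-12] * 7 + [1.0]    # tone-like spectrum
for name, spectrum in (("flat", flat), ("peaky", peaky)):
    a = tonality(spectral_flatness_db(spectrum))
    o = offset_db(a, i=10)
    print(name, a, o, jnd_threshold(100.0, o))
```

A flat spectrum yields the 5.5 dB noise offset, while a strongly peaked one saturates the tonality coefficient at 1 and yields the larger offset 14.5 + i, which lowers the threshold.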

Perceptual Entropy

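$$PE = \sum_{i=1}^{25} \sum_{\omega=bl_i}^{bh_i} \left[ \log_2\left(2\left|\mathrm{nint}\left(\frac{\mathrm{Re}(\omega)}{\sqrt{6T_i/k_i}}\right)\right| + 1\right) + \log_2\left(2\left|\mathrm{nint}\left(\frac{\mathrm{Im}(\omega)}{\sqrt{6T_i/k_i}}\right)\right| + 1\right) \right] \quad \text{(bits/sample)}$$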

i: index of critical band;
bl_i, bh_i: lower and upper bounds of band i;
k_i: number of transform components in band i;
T_i: masking threshold in band i;
nint: rounding to the nearest integer.
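The perceptual entropy sum can be sketched directly from the definitions above. The sketch assumes complex transform coefficients, since the formula uses real and imaginary parts, and takes each band as a hypothetical `(coefficients, threshold)` pair. It returns the double sum as written, without any further normalization, and uses Python's `round`, which sends exact ties to the even integer.

```python
import math

def perceptual_entropy(bands):
    """Sum the PE terms over bands given as (complex coefficients, threshold T_i)."""
    total = 0.0
    for coeffs, t_i in bands:
        k_i = len(coeffs)                 # number of components in band i
        step = math.sqrt(6.0 * t_i / k_i)  # quantizer step implied by T_i
        for w in coeffs:
            total += math.log2(2 * abs(round(w.real / step)) + 1)
            total += math.log2(2 * abs(round(w.imag / step)) + 1)
    return total

# A band whose components all sit below the threshold costs zero bits.
masked = [([0.1 + 0.1j] * 4, 1.0)]
audible = [([3.0 + 0.0j], 1.5)]
print(perceptual_entropy(masked), perceptual_entropy(audible))
```

Components that quantize to zero against the masking threshold contribute log2(1) = 0 bits, which is the sense in which inaudible content need not be coded.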

Returning

"If a tree falls in the forest with no one around to hear it, does it make a sound?"

From a Perceptual Coding standpoint, if no one can hear it, THERE IS NO TREE.

AAC Encoder Block Diagram

[Figure: AAC encoder block diagram. Blocks shown: input s(n); Perceptual Model; Gain Control; MDCT; TNS; Multi-Channel (M/S, Intensity); Prediction (z^-1); Scale Factor Extract; Quant; Entropy Coding; Iterative Rate Control Loop; side information coding and bitstream to the channel.]

Quantization and Coding

Power-law quantizer

Huffman Coding (table can be chosen)

Global Gain -> Quantization step size

Scale Factors -> noise shaping factor

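Quantization and Coding

    NOISE_CTL = 1;
    while NOISE_CTL
        % Inner loop: raise the global gain until the frame fits the bit budget.
        FINDING_RATE = 1;
        while FINDING_RATE
            Nr_bits = get_bits_needed();
            if (Nr_bits > max_bits)
                adjust_global_gain();
            else
                FINDING_RATE = 0;
            end
        end
        % Outer loop: compare the quantization noise with the masking threshold.
        q_noise = get_quant_noise_level();
        if (q_noise > Th(band))
            adjust_band_scale_factor();
        else
            NOISE_CTL = 0;
        end
    end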
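The two nested loops above can be sketched in runnable form. This is a simplified illustration under stated assumptions, not the AAC reference encoder: the step size is taken as 2^((global_gain - scale_factor)/4), the bit count is a log2 stand-in for Huffman coding, the quantizer uses a plain 3/4 power law without rounding offsets, and the outer loop is capped by a hypothetical `max_outer` because the two constraints cannot always be met together.

```python
import math

def quantize(x, step):
    """Power-law quantizer: sign(x) * nint((|x| / step)^(3/4))."""
    return int(math.copysign(round((abs(x) / step) ** 0.75), x))

def dequantize(q, step):
    """Inverse mapping: sign(q) * |q|^(4/3) * step."""
    return math.copysign(abs(q) ** (4.0 / 3.0), q) * step

def bits_needed(q_bands):
    # Stand-in for Huffman coding: log2(2|q| + 1) bits per coefficient.
    return sum(math.log2(2 * abs(q) + 1) for band in q_bands for q in band)

def rate_control(bands, thresholds, max_bits, max_outer=50):
    global_gain = 0
    scale = [0] * len(bands)
    for _ in range(max_outer):
        # Inner loop: coarsen the global step until the bit budget is met.
        while True:
            steps = [2.0 ** ((global_gain - sf) / 4.0) for sf in scale]
            q_bands = [[quantize(x, s) for x in band]
                       for band, s in zip(bands, steps)]
            if bits_needed(q_bands) <= max_bits:
                break
            global_gain += 1
        used_scale = list(scale)
        # Outer loop: find bands whose noise energy exceeds the threshold.
        noisy = [b for b, (band, qb, s) in enumerate(zip(bands, q_bands, steps))
                 if sum((x - dequantize(q, s)) ** 2
                        for x, q in zip(band, qb)) > thresholds[b]]
        if not noisy:
            break
        for b in noisy:
            scale[b] += 1  # finer step for this band on the next pass
    return global_gain, used_scale, q_bands

bands = [[10.0, -7.5, 3.2], [0.4, -0.3, 0.1]]
gain, scale, q = rate_control(bands, thresholds=[0.5, 0.5], max_bits=20.0)
print(gain, scale, q, bits_needed(q))
```

The inner loop always terminates, because a large enough step quantizes every coefficient to zero. The returned values always satisfy the bit budget, while the noise constraint is met only if the outer loop converges before the cap.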

Samples

Castanets

Original 48kHz Stereo

128kbps AAC Stereo (48kHz)

Piano

Timpani

References

[1] Ted Painter and Andreas Spanias, Perceptual coding of digital audio, Proceedings of the IEEE, 88(4):449-513, April 2000.

[2] Karlheinz Brandenburg, MP3 and AAC explained, AES 17th International Conference on High Quality Audio Coding, 1999.

[3] J.D. Johnston, A.J. Ferreira, Sum-Difference Stereo Transform Coding, Proc. ICASSP 1992.

[4] Deepen Sinha, James D. Johnston, Audio Compression at low bit rates using a Signal Adaptive switched Filterbank, Proc. ICASSP 1996, pp. 1053-1056.