T.-H. Tsai and Y.-C. Yang

Low power and cost effective VLSI design for an MP3 audio decoder using an optimized synthesis-subband approach
T.-H. Tsai and Y.-C. Yang
Department of Electrical Engineering, National Central University, Taiwan, ROC
IEE Proceedings on Computers and Digital Techniques

Abstract
An optimized approach to MPEG layer-3 (MP3) audio decoding is presented, with the main focus on the synthesis subband. Since the synthesis subband is the most power-consuming component in decoding, a cost-effective architecture is proposed based on system-design considerations. By means of algorithm and architecture design, the synthesis subband achieves a high throughput with reduced memory requirements and hardware complexity. A two-stage pipeline architecture allows 100% hardware utilization and is suitable for low-power implementation. In addition, a chip design in a 0.35 µm process has been completed. It occupies a die area of about 2.7 × 3.2 mm² with a transistor count of 157,469 and a low power dissipation of only 2.92 mW.

What's the problem
MPEG layer-3 (MP3) coding has been widely applied to current digital audio broadcasting and multimedia applications
A cost-effective and low-power implementation will largely reduce hardware and computational complexity
From the MP3 decoder point of view, the computational load depends on the realization of the synthesis subband

Outline
Introduction of synthesis subband
Implementation considerations and analysis
Proposed method and architecture
Results and comparison
Conclusion

Introduction (1)
Elementary concept of MP3
Multirate subband-based coding techniques
The encoder performs analysis subband filtering with 32 equally spaced filterbanks, based on a psychoacoustic model
The decoder performs synthesis subband filtering
Most fast-algorithm techniques interpret synthesis subband filtering as a modified discrete cosine transform (MDCT) with some additional windowing operations

Introduction (2)
One popular method
Translate the DCT into an FFT kernel
Advantage: because of the specific symmetric and recursive properties of the FFT equations, the number of multiplications and additions can be reduced
Disadvantage: these methods have complex control and irregular data flow, which introduce a high hardware cost
The proposed design
Reduced memory requirements and hardware complexity
High efficiency with 100% hardware utilization using a two-stage pipeline architecture

Introduction (3)
MP3 decoding flow
The hybrid filter bank is divided into the inverse modified discrete cosine transform with dynamic windowing and overlap (DWIMDCT), and the synthesis subband filterbank

Start

Get bit stream, find header

Decode Side information

Decode Scale factors

Decode Huffman data

Requantize Spectrum

Reorder Spectrum

Joint Stereo Processing (if necessary)

Alias reduction

IMDCT and windowing

Sub-band synthesis

Output PCM samples

Introduction (4)
Synthesis-subband decoding flow
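The synthesis-subband flow can be sketched in a few lines. The following is a minimal illustration of the reference procedure from the MPEG-1 audio standard (ISO/IEC 11172-3), not the optimized architecture proposed in this work. The function name `synthesize` and its argument names are hypothetical, and the 512-entry D window is taken as an input because its coefficients are tabulated in the standard and not reproduced here.

```python
import math

def synthesize(subband_samples, v_fifo, d_window):
    """One step of reference subband synthesis: 32 subband samples in, 32 PCM samples out.

    v_fifo is a 1024-entry list that is updated in place and must be kept
    between calls; d_window is the 512-entry synthesis window.
    """
    assert len(subband_samples) == 32
    assert len(v_fifo) == 1024 and len(d_window) == 512

    # 1. Shift the FIFO by 64 positions, discarding the oldest 64 values.
    v_fifo[64:] = v_fifo[:-64]

    # 2. Matrixing: 32 subband samples produce 64 new FIFO values.
    for i in range(64):
        v_fifo[i] = sum(
            math.cos((16 + i) * (2 * k + 1) * math.pi / 64) * subband_samples[k]
            for k in range(32)
        )

    # 3. Build the 512-entry U vector from 16 blocks of 32 FIFO values.
    u = [0.0] * 512
    for i in range(8):
        for j in range(32):
            u[i * 64 + j] = v_fifo[i * 128 + j]
            u[i * 64 + 32 + j] = v_fifo[i * 128 + 96 + j]

    # 4. Apply the D window element by element.
    w = [u[i] * d_window[i] for i in range(512)]

    # 5. Each output sample is the sum of 16 windowed partial products.
    return [sum(w[j + 32 * i] for i in range(16)) for j in range(32)]
```

Called once per group of 32 subband samples, with the same `v_fifo` passed each time, this yields 32 PCM samples per call. The matrixing loop in step 2 costs 64 × 32 multiply-accumulates in this direct form, which is why it dominates the computational load.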

Implementation analysis
Design target
Deliver the required high performance at the minimum cost and the smallest silicon area
The performance is determined by real-time constraints

Implementation analysis (cont.)
MOPS = Fs × C × N
Fs: sample frequency
C: total number of numerical calculations per sample
N: number of audio channels
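As a quick worked example of this estimate, the sketch below evaluates the product and scales it to millions of operations per second. The function name `required_mops` is hypothetical, and the operation count of 100 per sample is an arbitrary placeholder rather than a figure from this work; 44.1 kHz stereo is used only because it is a common MP3 configuration.

```python
def required_mops(sample_rate_hz, ops_per_sample, channels):
    """Estimate the real-time load in millions of operations per second."""
    return sample_rate_hz * ops_per_sample * channels / 1e6

# Placeholder inputs: 44.1 kHz, 100 operations per sample (assumed), stereo.
load = required_mops(44_100, 100, 2)
print(f"{load:.2f} MOPS")
```

Because the load scales linearly in C, any reduction in the per-sample operation count of the synthesis subband translates directly into a lower real-time requirement.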

Implementation consideration
In the synthesis subband, the IMDCT can be broken into an FFT, a data shift, preprocessing and post-processing
Three considerations
In the initial transform, real-number computation is translated into complex-number computation
The data shift, preprocessing and post-processing still contain complex multiplications
FFT algorithms always need many multipliers, and the recursive butterfly process leads to complex interconnection and routing

Proposed method
Normal IMDCT
Proposed IMDCT

Requires about ... of the original multiply-accumulate computations
The required size of the RAM buffer can be reduced to only 512 words per channel (... of the original)

Architecture
IMDCT
IPQMF

Architecture (cont.)
Pipeline architecture
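The utilization benefit of a two-stage pipeline can be illustrated with a small scheduling sketch. It assumes, for simplicity, that the IMDCT and IPQMF stages take equal time per block; the function names are hypothetical and the model says nothing about the actual memory arrangement of the chip.

```python
def two_stage_schedule(num_blocks):
    """List, per time slot, the block index each stage works on (None = idle)."""
    schedule = []
    for slot in range(num_blocks + 1):
        imdct_block = slot if slot < num_blocks else None  # stage 1: IMDCT
        ipqmf_block = slot - 1 if slot >= 1 else None      # stage 2: IPQMF on the previous block
        schedule.append((imdct_block, ipqmf_block))
    return schedule

def utilization(schedule):
    """Fraction of stage-slots that are busy."""
    busy = sum(block is not None for slot in schedule for block in slot)
    return busy / (2 * len(schedule))

for imdct_block, ipqmf_block in two_stage_schedule(3):
    print(imdct_block, ipqmf_block)
```

Only the first and last slots leave one stage idle, so utilization is n / (n + 1) for n blocks and approaches 100% in continuous decoding.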

Memory configuration (1)

Memory configuration (2)
Data conflicts in IMDCT and IPQMF

Memory configuration (3)
Memory data access with pipeline operation

Results and comparison (1)

Conclusion
By means of a novel algorithm and architecture, the synthesis subband achieves better performance
It also achieves a high throughput, with low memory requirements and hardware complexity

[Figure: synthesis-subband decoding flow. Sub-band samples (32 subbands × 18 samples) → IMDCT (64 outputs, indexed 0 to 63) → 16 × 64 FIFO (1024 samples) → U vector (512 samples, in blocks of 32) × D window → partial products w0 to w15 → output = Sum(w0 ~ w15)]
