Development and Implementation of an MPEG1 Layer III Decoder on x86 and TMS320C6711 platforms
Farina Simone (Braidotti Enrico)
DECODING PROCESS
Input File
• Retrieving File Information
• Huffman Decoding
• Requantization
• Stereo Processing
• Reordering
• Alias Reconstruction
• Hybrid Synthesis (IMDCT, Windowing, Overlap-Add)
• Frequency Inversion
• Synthesis Polyphase Filterbank
PCM output samples
ALIAS RECONSTRUCTION
It is performed only on long blocks, that is, for granules that use pure long blocks or mixed blocks (in the mixed case, only on the long-block part).
Let's see what long and short blocks are.
[Figure, three panels: the unencoded signal; the same signal encoded using long blocks; the same signal encoded using short blocks]
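The alias reconstruction step can be sketched as a set of butterflies applied across each boundary between adjacent subbands. The sketch below assumes the eight coefficients tabulated in the MPEG-1 audio standard and a granule stored as a flat list of 576 frequency lines (32 subbands of 18 lines). The function name and the `num_subbands` parameter are hypothetical. It is an illustration, not necessarily the authors' implementation.

```python
import math

# Butterfly coefficients c_i as tabulated in the MPEG-1 audio standard.
C = [-0.6, -0.535, -0.33, -0.185, -0.095, -0.041, -0.0142, -0.0037]
CS = [1.0 / math.sqrt(1.0 + c * c) for c in C]
CA = [c / math.sqrt(1.0 + c * c) for c in C]

def alias_reduction(xr, num_subbands=32):
    """Apply 8 butterflies across each boundary between adjacent subbands.

    xr: 576 frequency lines (32 subbands x 18 lines) of a granule.
    num_subbands: how many leading subbands are long blocks
                  (assumption: 32 for pure long blocks, 2 for mixed blocks).
    Returns a new list; the input is left untouched.
    """
    out = list(xr)
    for sb in range(1, num_subbands):
        for i in range(8):
            lo = 18 * sb - 1 - i   # line just below the subband boundary
            hi = 18 * sb + i       # line just above the subband boundary
            a, b = out[lo], out[hi]
            out[lo] = a * CS[i] - b * CA[i]
            out[hi] = b * CS[i] + a * CA[i]
    return out
```

Since `CS[i]**2 + CA[i]**2 == 1`, each butterfly is a plane rotation, so the step preserves the energy of the granule.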
HYBRID SYNTHESIS
• IMDCT (Inverse Modified Discrete Cosine Transform)
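Each subband is inverse-transformed separately, with a transform size that depends on the block length.
• 6-point IMDCT (6 input coefficients, 12 output samples)
Used with short blocks, which are chosen to limit pre-echoes
• 18-point IMDCT (18 input coefficients, 36 output samples)
Used with long blocks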
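For reference, a direct (unoptimized) IMDCT can be written straight from its definition. The sketch below assumes the usual MPEG-1 Layer III formulation, in which n/2 input coefficients produce n output samples (n = 12 for short blocks, n = 36 for long blocks). The function name is hypothetical.

```python
import math

def imdct(coeffs):
    """Direct IMDCT: len(coeffs) = n/2 inputs -> n output samples.

    x[i] = sum_k X[k] * cos(pi / (2n) * (2i + 1 + n/2) * (2k + 1))
    """
    half = len(coeffs)
    n = 2 * half
    return [
        sum(coeffs[k] * math.cos(math.pi / (2 * n) * (2 * i + 1 + half) * (2 * k + 1))
            for k in range(half))
        for i in range(n)
    ]

short_block = imdct([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])   # 6 in, 12 out
long_block = imdct([0.0] * 17 + [1.0])                 # 18 in, 36 out
```

The output has built-in symmetries (the first half is antisymmetric, the second half symmetric), which is the kind of redundancy that fast algorithms exploit.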
HYBRID SYNTHESIS
• Fast IMDCT algorithm (Szu-Wei Lee)
Based on the symmetry properties of the cosine function
Needs a rearranging stage to restore values to their original positions
Drastically reduces the number of operations compared to a direct implementation
                            Multiplications (short / long)    Additions (short / long)
Direct implementation       216 / 648                         180 / 612
Fast IMDCT (Szu-Wei Lee)    33 / 43                           69 / 115
Improvement                 84.7 % / 93.3 %                   61.7 % / 81.2 %
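The direct-implementation figures can be reproduced with a little arithmetic, under the assumption that a direct IMDCT with n/2 inputs and n outputs costs n · n/2 multiplications and n · (n/2 − 1) additions, and that a short-block subband needs three 6-input transforms. The helper names are hypothetical.

```python
def direct_imdct_ops(n_in, windows=1):
    """Operation count of a direct IMDCT with n_in inputs and 2*n_in outputs."""
    n_out = 2 * n_in
    mults = windows * n_out * n_in
    adds = windows * n_out * (n_in - 1)
    return mults, adds

def saving_percent(direct, fast):
    """Relative reduction in operations, in percent."""
    return 100.0 * (1.0 - fast / direct)

short_ops = direct_imdct_ops(6, windows=3)   # three short windows per subband
long_ops = direct_imdct_ops(18)              # one long block per subband
short_mult_saving = saving_percent(short_ops[0], 33)
long_add_saving = saving_percent(long_ops[1], 115)
```

With these assumptions the counts match the direct-implementation figures, and the computed savings agree with the quoted percentages to within 0.1 percentage point.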
HYBRID SYNTHESIS
• Windowing
Once transformed, subbands are windowed according to the value of block_type (in subbands with short blocks, each of the three windows is transformed separately and the results are then overlapped)
• Overlap-adding
The first half of each transformed block is added to the second half of the corresponding block in the previous granule
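The windowing and overlap-add steps for one long-block subband can be sketched as follows. The sketch assumes the normal (block_type 0) sine window and a 36-sample IMDCT output. The function name and the way the tail of the previous granule is passed around are hypothetical.

```python
import math

# Normal long-block window (block_type 0): sin(pi/36 * (i + 1/2)), i = 0..35.
LONG_WINDOW = [math.sin(math.pi / 36 * (i + 0.5)) for i in range(36)]

def window_and_overlap_add(imdct_out, prev_tail, window=LONG_WINDOW):
    """Window 36 IMDCT samples and overlap-add them with the previous granule.

    Returns (18 output samples, 18 samples to keep for the next granule).
    """
    windowed = [s * w for s, w in zip(imdct_out, window)]
    out = [windowed[i] + prev_tail[i] for i in range(18)]
    return out, windowed[18:]

# Usage: the tail starts at zero and is carried from granule to granule.
tail = [0.0] * 18
out1, tail = window_and_overlap_add([1.0] * 36, tail)
out2, tail = window_and_overlap_add([1.0] * 36, tail)
```

Each subband of each channel needs its own saved tail, since the overlap is between corresponding blocks of consecutive granules.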
FREQUENCY INVERSION
Every odd-indexed sample in every odd-indexed subband (counting from 0) is multiplied by -1.
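A minimal sketch of this step, assuming the granule is stored as 32 subbands of 18 time samples each and that indices are counted from 0 (the function name is hypothetical):

```python
def frequency_inversion(granule):
    """Negate odd-indexed samples of odd-indexed subbands, in place.

    granule[sb][i]: 32 subbands x 18 time samples.
    """
    for sb in range(1, 32, 2):
        for i in range(1, 18, 2):
            granule[sb][i] = -granule[sb][i]
    return granule

granule = frequency_inversion([[1.0] * 18 for _ in range(32)])
```

Only 16 × 9 = 144 of the 576 samples change sign, so the step is cheap compared with the transforms around it.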
SYNTHESIS POLYPHASE FILTERBANK
Each run of this process produces 32 PCM audio samples.
• 576 / granule
• 1152 / frame (equal to about 26 ms of audio @ 44.1 kHz)
Composed of several steps, it turns out to be the most time-consuming stage of the overall decoding process
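The sample counts follow from the structure of a frame: 18 runs of the filterbank per granule, each yielding 32 samples, and 2 granules per frame. A small sketch of the arithmetic (constant and function names are hypothetical):

```python
SAMPLES_PER_RUN = 32      # one run of the synthesis filterbank
RUNS_PER_GRANULE = 18     # 18 time samples per subband in a granule
GRANULES_PER_FRAME = 2

samples_per_granule = SAMPLES_PER_RUN * RUNS_PER_GRANULE
samples_per_frame = samples_per_granule * GRANULES_PER_FRAME

def frame_duration_ms(sample_rate_hz):
    """Duration of one frame of audio, in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz

duration_44k1 = frame_duration_ms(44100)
```

At 44.1 kHz this gives roughly 26.1 ms per frame, which is also the time budget for decoding one frame in real time.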
SYNTHESIS POLYPHASE FILTERBANK
• Polyphase Matrixing
It is a cosine-like transform (non-standard)
The direct computation involves a 64×32 matrix and takes almost ¼ of the decoding time
It needs optimization for real-time decoding:
• K. Konstantinides’ algorithm
• 32-point Fast DCT (B.G.Lee)
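The link between the matrixing and a 32-point DCT can be sketched as follows. The direct form uses the standard's 64×32 cosine matrix. The second form computes a single 32-point DCT and fills the 64 outputs by index mapping and sign changes. This mapping follows from the symmetry of the cosine and is given here as an illustration in the spirit of the optimization named above, not necessarily the exact formulation used by the authors. All function names are hypothetical, and `dct32` is written directly where a fast DCT would be used in practice.

```python
import math

def matrixing_direct(s):
    """V[i] = sum_k cos((16 + i)(2k + 1) pi / 64) * S[k], i = 0..63."""
    return [sum(math.cos((16 + i) * (2 * k + 1) * math.pi / 64) * s[k]
                for k in range(32))
            for i in range(64)]

def dct32(s):
    """Unscaled 32-point DCT: D[j] = sum_k S[k] cos((2k + 1) j pi / 64)."""
    return [sum(s[k] * math.cos((2 * k + 1) * j * math.pi / 64)
                for k in range(32))
            for j in range(32)]

def matrixing_via_dct(s):
    """Same 64 outputs, obtained from one 32-point DCT."""
    d = dct32(s)
    v = [0.0] * 64
    for i in range(16):
        v[i] = d[16 + i]
    # v[16] is always zero: cos((2k + 1) * pi / 2) == 0
    for i in range(17, 49):
        v[i] = -d[48 - i]
    for i in range(49, 64):
        v[i] = -d[i - 48]
    return v

subband_samples = [math.sin(0.37 * k) + 0.1 * k for k in range(32)]
v_direct = matrixing_direct(subband_samples)
v_fast = matrixing_via_dct(subband_samples)
```

The direct form costs 64 × 32 = 2048 multiplications per run, whereas the second form needs only the cost of one 32-point DCT, which a fast algorithm reduces much further.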
SYNTHESIS POLYPHASE FILTERBANK
• Konstantinides’ Algorithm
SYNTHESIS POLYPHASE FILTERBANK
• FCT Algorithm (Byeong-Gi Lee )
Using trigonometric properties, a 2^M-point DCT can be performed by 2^(M-1) 2-point DCTs
Direct computation
× = N²
+ = N·(N−1)
FCT
× = (N/2)·log2(N)
+ < (3N/2)·log2(N)
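A recursive sketch of a Lee-style fast cosine transform is shown below. It splits an N-point DCT into two N/2-point DCTs at each level and counts one multiplication per scaled element (in a real implementation the scale factors would be precomputed). The unscaled DCT convention, the function names and the optional counter argument are assumptions made for illustration.

```python
import math

def fct(x, counter=None):
    """Unscaled DCT: X[k] = sum_n x[n] cos(pi (2n + 1) k / (2N)), N a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    alpha = [x[i] + x[n - 1 - i] for i in range(half)]
    # One multiplication per element, by 1 / (2 cos((2i + 1) pi / (2N))).
    beta = [(x[i] - x[n - 1 - i]) * (0.5 / math.cos((i + 0.5) * math.pi / n))
            for i in range(half)]
    if counter is not None:
        counter[0] += half
    a = fct(alpha, counter)   # even-indexed outputs
    b = fct(beta, counter)    # combined pairwise into odd-indexed outputs
    out = [0.0] * n
    for k in range(half):
        out[2 * k] = a[k]
        out[2 * k + 1] = b[k] + (b[k + 1] if k + 1 < half else 0.0)
    return out

def dct_direct(x):
    """Reference O(N^2) computation of the same transform."""
    n = len(x)
    return [sum(x[m] * math.cos(math.pi * (2 * m + 1) * k / (2 * n)) for m in range(n))
            for k in range(n)]

samples = [math.cos(0.2 * m) - 0.05 * m for m in range(32)]
mults = [0]
fast_result = fct(samples, mults)
```

For N = 32 the counter ends at 80 = (N/2)·log2(N), compared with N² = 1024 multiplications for the direct computation.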
WAVE STANDARD
• Identified by a 44-byte header, which holds information about:
  • sampling frequency
  • number of channels
  • . . .
• Uncompressed PCM audio samples (normally with 16 bits/sample resolution), stored in the following way:
Sampling instant    Channel
0                   1 (Left)
                    2 (Right)
1                   1 (Left)
                    2 (Right)
2                   1 (Left)
                    2 (Right)
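The layout above can be sketched with Python's `struct` module. The sketch assumes the canonical 44-byte RIFF/WAVE header for uncompressed PCM and little-endian 16-bit samples. The function names are hypothetical.

```python
import struct

def wav_header(sample_rate, channels, num_frames, bits_per_sample=16):
    """Build a canonical 44-byte PCM WAVE header."""
    block_align = channels * bits_per_sample // 8   # bytes per sampling instant
    data_size = num_frames * block_align
    return struct.pack(
        '<4sI4s4sIHHIIHH4sI',
        b'RIFF', 36 + data_size, b'WAVE',
        b'fmt ', 16, 1, channels, sample_rate,       # 1 = uncompressed PCM
        sample_rate * block_align, block_align, bits_per_sample,
        b'data', data_size)

def interleave(left, right):
    """Store samples as L0 R0 L1 R1 ... (16-bit signed, little-endian)."""
    data = bytearray()
    for l, r in zip(left, right):
        data += struct.pack('<hh', l, r)
    return bytes(data)

header = wav_header(44100, 2, num_frames=2)
payload = interleave([1, 2], [-1, -2])
```

Writing `header + payload` to a file opened in binary mode gives a minimal stereo WAVE file.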
PERFORMANCE ANALYSIS
• PC Performance
The decoder, without optimization, works in real time on the following CPUs:
The decoder, with optimization, reaches 17.5× real-time speed on a Pentium IV CPU
PERFORMANCE ANALYSIS
• C6711 DSK Performance
The decoder, without optimization, does not work in real time on the board
• The parallel port is used for data transfer and it is very slow
• Most algorithms need optimization (only Huffman Decoding is optimized)
• The code needs some ASM optimization to use the full potential of the board architecture
The whole decoding process (except data transfer TO the external hard disk) takes about 10 times longer than real-time operation allows. With optimization, real time should be an easy goal to reach.
PERFORMANCE ANALYSIS
• Time occupation of optimized processes on C6711 DSK [chart]
PERFORMANCE ANALYSIS
• Time occupation of other processes on C6711 DSK [chart]