Project: IEEE P802.15 Working Group for Wireless Personal Area Networks (WPANs)
Submission Title: [Implementation of High Speed FFT processor for MB-OFDM System]
Date Submitted: [September 2004] Revised: []
Source: [Sang-sung Choi, Sang-in Cho] Company [Electronics and Telecommunications Research Institute] Address [161 Gajeong-dong, Yuseong-gu, Daejeon, 305-350 Korea] Voice : [+82-42-860-6722], FAX : [+82-42-860-5199], E-mail [[email protected]]
Re: [Technical contribution]
Abstract: [This presentation describes an implementation method for the IFFT/FFT processor of the MB-OFDM UWB system]
Purpose: [Technical contribution on implementing the IFFT/FFT processor proposed for the MB-OFDM UWB system]
Notice: This document has been prepared to assist the IEEE P802.15. It is offered as a basis for discussion and is not binding on the contributing individual or organization. The material in this document is subject to change in form and content after further study. The contributor reserves the right to add, amend or withdraw material contained herein.
Release: The contributor acknowledges and accepts that this contribution becomes the property of IEEE and may be made publicly available by P802.15.
September 2004 doc.: IEEE 802.15-04-0467-00-003a
Implementation of High Speed FFT processor
for MB-OFDM System
Sang-Sung Choi ([email protected])
Sang-In Cho ([email protected])
E T R I
www.etri.re.kr
Introduction
The MB-OFDM UWB proposal requires high-speed IFFT/FFT processors with 128-point computation.
The digital output of the IFFT processor is converted to an analog signal by the DAC and then passes through a sharp LPF to satisfy the transmit PSD mask.
- Transmitter using 128-point IFFT processor (DAC speed: 528 MHz)
[Figure: transmitter blocks S/P, 128-point IFFT, P/S, D/A, LPF; input is 128-point complex data; rates labeled 528 Msps]
- LPF shape & frequency spectrum of OFDM signal after DAC
[Figure: spectrum on an f [MHz] axis with labels 0, 8.25, 12.375, 528, 1056; conventional filter shape versus desired filter shape]
The TX LPF largely determines whether the MB-OFDM UWB system meets the transmit PSD mask, but designing a TX LPF that satisfies the mask is not easy.
Two methods are considered for designing a TX LPF that satisfies the transmit PSD mask:
1) Fix the DAC sampling rate at 528 MHz and design a high-order TX LPF.
2) Increase the DAC sampling rate and reduce the order of the TX LPF.
Method 2) uses a 2x oversampling rate at the DAC (presented in doc. IEEE 802.15-03/275r0):
- It reduces the order of the TX LPF.
- It has a performance advantage over method 1).
The two methods trade off power consumption, gate size, and other factors.
ETRI is developing a prototype UWB system using a 256-point IFFT processor (DAC speed: 1056 MHz).
Proposed IFFT Processor Approach
To make low-pass filtering of the 528 MHz baseband signal after the DAC easier, space must be created between the copies of the OFDM signal that repeat in the frequency spectrum. This is accomplished by 128-point zero-padding.
- Transmitter using 256-point IFFT processor (DAC speed: 1056 MHz)
[Figure: transmitter blocks S/P, 256-point IFFT, P/S, D/A, LPF; input is 128-point complex data + 128 zeros; rates labeled 528 Msps and 1056 Msps]
- LPF shape & frequency spectrum of OFDM signal after DAC
[Figure: spectrum on an f [MHz] axis with labels 0, 8.25, 12.375, 1056; conventional filter shape versus desired filter shape]
IFFT/FFT Processor Specification
The input data of the FFT processor are QPSK-modulated 128-point complex data.
The input data of the IFFT processor become 256 points, consisting of the QPSK-modulated 128-point complex data and 128 zeros.
- Input data of original IFFT processor (128-point QPSK data)
- Input data of proposed IFFT processor (128-point QPSK data + 128-point zeros)
[Figure: frequency-domain index maps for the original input (indices 0 to 127, subcarriers -64 to 63) and the proposed input (indices 0 to 255), with time-domain sketches over n labeled "128-point complex data" and "256-point complex data"]
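The effect of the 128-point zero-padding can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a direct O(N^2) inverse DFT rather than the proposed hardware structure, the QPSK pattern is made up, and it assumes the 128 zeros sit in the middle (highest-frequency) bins of the 256-point input, which the slides do not spell out. Under that assumption, every second output sample of the 256-point IFFT equals the 128-point IFFT output up to a factor of 2, i.e. the 256-point output is a 2x oversampled version of the same waveform.

```python
import cmath

def idft(X):
    """Direct O(N^2) inverse DFT with 1/N scaling (reference only, not a fast algorithm)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def zero_pad_spectrum(X):
    """Assumed layout: bins 0..N/2-1 stay at the bottom, bins N/2..N-1 move to the top,
    and N zeros fill the middle (highest-frequency) bins."""
    N = len(X)
    half = N // 2
    return X[:half] + [0j] * N + X[half:]

# Deterministic QPSK test pattern (hypothetical data, for illustration only).
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
X128 = [qpsk[(3 * k + k // 7) % 4] for k in range(128)]

x128 = idft(X128)                      # original 128-point IFFT output
x256 = idft(zero_pad_spectrum(X128))   # 256-point IFFT output (2x oversampled)

# Decimating the 256-point output by 2 recovers the 128-point output (scaled by 1/2).
max_err = max(abs(2 * x256[2 * m] - x128[m]) for m in range(128))
print(f"max |2*x256[2m] - x128[m]| = {max_err:.3e}")
```

The odd-indexed samples of `x256` are the interpolated values between the original samples; they are what pushes the first spectral image away from the baseband signal after the DAC.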
Proposed transceiver for MB-OFDM UWB PHY proposal
[Figure: transceiver block diagram. TX side: frequency-domain data, 256-point IFFT, add ZP / guard interval, clipping level, DAC (1056 MHz), TX LO. UWB channel. RX side: RX LO, ADC (528 MHz), remove ZP / guard interval, 128-point FFT, frequency-domain data.]
256-point IFFT: input 4 samples/clock, output 8 samples/clock
128-point FFT: input 4 samples/clock, output 4 samples/clock
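The samples-per-clock figures can be related to the converter rates with simple arithmetic. The sketch below assumes, hypothetically, that the IFFT output has to keep pace with the 1056 MHz DAC and the FFT input with the 528 MHz ADC on every clock cycle; the slides do not state the actual processor clock.

```python
def clock_mhz(sample_rate_msps, samples_per_clock):
    """Clock frequency needed to move samples_per_clock samples each cycle."""
    return sample_rate_msps / samples_per_clock

ifft_clock = clock_mhz(1056, 8)  # IFFT output side: 8 samples/clock toward the DAC
fft_clock = clock_mhz(528, 4)    # FFT input side: 4 samples/clock from the ADC
print(ifft_clock, fft_clock)
```

Under this assumption both processors come out at the same clock frequency, which is consistent with using the same 32-point building block in both.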
Characteristics of Multipliers
The multiplier is one of the most dominant elements in FFT/IFFT implementation.
- Standard 2's complement multiplier
  • (W-bit) x (W-bit) = (2W-1)-bit
  • Many DSP applications need only W-bit products
- Fixed-width multiplier
  • Quantization to W bits by eliminating the (W-1) least significant bits
  • Can reduce area by approximately 50%, but truncation error is introduced
  • Proper error compensation bias needed
- Canonic Signed Digit (CSD) multiplier
  • Constant coefficient
  • 33% fewer nonzero digits than 2's complement numbers
- Modified Booth multiplier
  • Variable coefficient
  • The number of partial products is reduced to W/2
  • These multipliers can achieve about 40% reduction in area and power consumption
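The two recodings named above can be illustrated in software. This is a minimal sketch, not the hardware design from the slides: `to_csd` and `booth_radix4` are hypothetical helper names, and the coefficient 473 is just an example 10-bit value. CSD recoding rewrites a constant with digits in {-1, 0, 1} so that no two adjacent digits are nonzero; radix-4 modified Booth recoding turns a W-bit two's complement operand into W/2 digits in {-2, ..., 2}, one per partial product.

```python
def to_csd(value):
    """CSD (non-adjacent form) digits of an integer, LSB first, each in {-1, 0, 1}."""
    digits = []
    while value != 0:
        if value & 1:
            d = 2 - (value & 3)  # +1 if value % 4 == 1, -1 if value % 4 == 3
            value -= d
        else:
            d = 0
        digits.append(d)
        value >>= 1
    return digits

def booth_radix4(value, width):
    """Radix-4 modified Booth digits (LSB first, each in {-2..2}) of a
    width-bit two's complement integer; width must be even."""
    assert width % 2 == 0
    bits = [(value >> i) & 1 for i in range(width)]  # >> sign-extends negative ints
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, width, 2):
        digits.append(bits[i] + prev - 2 * bits[i + 1])
        prev = bits[i + 1]
    return digits

coeff = 473  # example 10-bit constant (binary 0111011001), chosen for illustration
csd = to_csd(coeff)
booth = booth_radix4(coeff, 10)

binary_nonzero = bin(coeff).count("1")
csd_nonzero = sum(1 for d in csd if d != 0)
print("binary nonzero digits:", binary_nonzero)
print("CSD digits (LSB first):", csd, "nonzero:", csd_nonzero)
print("Booth digits (LSB first):", booth, "partial products:", len(booth))
```

Each nonzero CSD digit costs one shifted add or subtract, which is why CSD suits fixed coefficients, while Booth recoding is done on the fly and therefore suits coefficients that change from cycle to cycle.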
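The radix-2^4 structure of the FFT processor

DFT:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}$$

Radix-2^4 index decomposition:
$$n = \frac{N}{2} n_1 + \frac{N}{4} n_2 + \frac{N}{8} n_3 + \frac{N}{16} n_4 + n_5$$
$$k = k_1 + 2 k_2 + 4 k_3 + 8 k_4 + 16 k_5$$
$$X(k_1 + 2k_2 + 4k_3 + 8k_4 + 16k_5) = \sum_{n_5=0}^{\frac{N}{16}-1} \sum_{n_4=0}^{1} \sum_{n_3=0}^{1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1} x\!\left(\frac{N}{2} n_1 + \frac{N}{4} n_2 + \frac{N}{8} n_3 + \frac{N}{16} n_4 + n_5\right) W_N^{nk}$$

[Figure: Radix-2 structure. Butterfly(1), 16 delay, TWF; Butterfly(2), 8 delay, TWF; Butterfly(3), 4 delay, TWF; Butterfly(4), 2 delay, TWF; Butterfly(5), 1 delay, TWF]
[Figure: Radix-2^4 structure. Butterfly(1), 16 delay, W32; Butterfly(2), 8 delay, -j; Butterfly(3), 4 delay, TWF; Butterfly(4), 2 delay, -j; Butterfly(5), 1 delay, W16. Multiplier labels: CSD multiplier, CSD multiplier, Modified Booth multiplier]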
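As a software reference for the five-stage pipeline, the sketch below implements a plain radix-2 decimation-in-frequency FFT for N = 32 and checks it against the DFT definition. It is not the radix-2^4 twiddle arrangement or the SDF hardware itself; it only assumes the standard DIF butterfly, with W_N = exp(-j2π/N). The butterfly span takes the values 16, 8, 4, 2, 1, which are the feedback delay lengths shown in the figures, and the result comes out in bit-reversed order, which is why a bit-reverse unit is needed at the output.

```python
import cmath

def dif_fft(x):
    """Radix-2 DIF FFT; returns (output in bit-reversed order, list of butterfly spans)."""
    a = list(x)
    N = len(a)
    spans = []
    span = N // 2
    while span >= 1:
        spans.append(span)
        for start in range(0, N, 2 * span):
            for j in range(span):
                w = cmath.exp(-2j * cmath.pi * j / (2 * span))  # twiddle for this stage
                u, v = a[start + j], a[start + j + span]
                a[start + j] = u + v               # upper butterfly output
                a[start + j + span] = (u - v) * w  # lower output, then twiddle
        span //= 2
    return a, spans

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def dft(x):
    """Direct evaluation of X(k) = sum_n x(n) W_N^{nk} (reference only)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

x = [complex((5 * n) % 7 - 3, (3 * n) % 5 - 2) for n in range(32)]  # arbitrary test data
scrambled, spans = dif_fft(x)
natural = [scrambled[bit_reverse(k, 5)] for k in range(32)]  # undo the bit reversal
max_err = max(abs(p - q) for p, q in zip(natural, dft(x)))
print("butterfly spans:", spans)
print(f"max error vs direct DFT = {max_err:.3e}")
```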
The structure of 256-point IFFT processor
[Figure: block diagram with Input Data, S/P, eight parallel 32-point radix-2^4 FFT paths (P1 to P8), Bit Reverse, P/S, Output Data. The S/P output streams are labeled (0)(4)(8)...(124); (1)(5)(9)...(125); (2)(6)(10)...(126); (3)(7)(11)...(127).]
- 32-point radix-2^4 FFT structure
- 8-level parallelism
- DIF (Decimation In Frequency), SDF (Single Delay Feedback)
- Fixed CSD & Modified Booth multipliers used
The structure of 256-point IFFT processor
[Figure: detailed datapath of the 256-point IFFT processor. Each of the eight parallel paths is a five-stage pipeline: Butterfly(1), 16 delay, W32; Butterfly(2), 8 delay, -j; Butterfly(3), 4 delay, TWF ROM; Butterfly(4), 2 delay, -j; Butterfly(5), 1 delay, W16 (with an additional -j in some paths). The paths are combined by Butterfly(6) and Butterfly(7) stages and a Bit Reverse Unit.]
Input streams: x(0), x(4), x(8), ...; x(1), x(5), x(9), ...; x(2), x(6), x(10), ...; x(3), x(7), x(11), ...
Intermediate streams: v(0), v(4), v(8), ... through v(3), v(7), v(11), ..., and likewise v`(0), v`(4), v`(8), ... through v`(3), v`(7), v`(11), ...
Output streams:
X(0), X(32), X(16), X(48), ...
X(128), X(160), X(144), X(176), ...
X(64), X(96), X(80), X(112), ...
X(192), X(224), X(208), X(240), ...
X(1), X(33), X(17), X(49), ...
X(129), X(161), X(145), X(177), ...
X(65), X(97), X(81), X(113), ...
X(193), X(225), X(209), X(241), ...
- Butterfly units: 48
- -j multipliers: 22
- CSD multipliers: 16
- Modified Booth multipliers: 8
The structure of 256-point IFFT processor
[Figure: the 256-point IFFT datapath drawn with butterfly types. Each 32-point path: Butterfly (Type 1 or Type 2), 16 delay, W32; Butterfly (Type 1), 8 delay; Butterfly (Type 2), 4 delay, TWF ROM; Butterfly (Type 1), 2 delay; Butterfly (Type 2), 1 delay, W16. The paths are followed by further Type 1 and Type 2 butterflies and the Bit Reverse Unit. Multiplier labels: CSD multiplier, Modified Booth multiplier, CSD multiplier. The input streams x, intermediate streams v and v`, and output streams X are the same as on the previous slide.]
The structure of 128-point FFT processor
[Figure: block diagram with Input Data, S/P, four parallel 32-point radix-2^4 FFT paths (P1 to P4), Bit Reverse, P/S, Output Data. The S/P output streams are labeled (0)(4)(8)...(124); (1)(5)(9)...(125); (2)(6)(10)...(126); (3)(7)(11)...(127).]
- 32-point radix-2^4 FFT structure
- 4-level parallelism
- DIF (Decimation In Frequency), SDF (Single Delay Feedback)
- Fixed CSD & Modified Booth multipliers used
The structure of 128-point FFT processor
[Figure: detailed datapath of the 128-point FFT processor. Each of the four parallel paths: Butterfly (Type 1 or Type 2), 16 delay, W16; Butterfly (Type 1), 8 delay; Butterfly (Type 2), 4 delay, TWF ROM; Butterfly (Type 1), 2 delay; Butterfly (Type 2), 1 delay, W16. The paths are followed by further Type 1 and Type 2 butterflies and the Bit Reverse Unit. Multiplier labels: CSD multiplier, Modified Booth multiplier, CSD multiplier.]
Input streams: x(0), x(4), x(8), ...; x(1), x(5), x(9), ...; x(2), x(6), x(10), ...; x(3), x(7), x(11), ...
Intermediate streams: v(0), v(4), v(8), ...; v(2), v(6), v(10), ...; v(3), v(7), v(11), ...; v(1), v(5), v(9), ...
Output streams:
X(0), X(16), X(8), ...
X(64), X(80), X(72), ...
X(32), X(48), X(40), ...
X(96), X(112), X(104), ...
- Butterfly units: 24
- -j multipliers: 11
- CSD multipliers: 8
- Modified Booth multipliers: 4
Simulation result of 256-point IFFT processor
[Figure: test setup and constellation plots. Blocks: 128-point QPSK modulated data + 128-point '0'; 256-point IFFT, radix-2^4 DIF SDF (fixed point); decimation (2); MATLAB function FFT(128); SQNR calculation.]
Case   Input bits   Output bits   Multiplier coefficient bits   SQNR
1      3            20            10                            52 dB
2      3            11            8                             30 dB
Simulation result of 128-point FFT processor
[Figure: test setup and constellation plots. Blocks: 128-point QPSK modulated data + 128-point '0'; MATLAB function IFFT(256); decimation (2); 128-point FFT, radix-2^4 DIF SDF (fixed point); SQNR calculation.]
Case   Input bits   Output bits   Multiplier coefficient bits   SQNR
1      10           20            10                            52 dB
2      10           12            8                             30 dB
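The SQNR figures come from comparing a fixed-point result with a floating-point reference. The sketch below shows only the shape of such a calculation: it quantizes just the twiddle coefficients of a direct 128-point DFT and leaves the datapath in floating point, so it is not the authors' simulation and is not expected to reproduce the 52 dB and 30 dB values. The helper names and the QPSK test pattern are hypothetical.

```python
import cmath
import math

def quantize(c, bits):
    """Round real and imaginary parts to signed fixed point:
    1 sign bit and bits-1 fraction bits, with saturation."""
    scale = 2 ** (bits - 1)
    def q(v):
        return max(-scale, min(scale - 1, round(v * scale))) / scale
    return complex(q(c.real), q(c.imag))

def dft(x, coeff_bits=None):
    """Direct DFT; twiddles are quantized to coeff_bits bits when given."""
    N = len(x)
    out = []
    for k in range(N):
        acc = 0j
        for n in range(N):
            w = cmath.exp(-2j * cmath.pi * k * n / N)
            if coeff_bits is not None:
                w = quantize(w, coeff_bits)
            acc += x[n] * w
        out.append(acc)
    return out

def sqnr_db(reference, measured):
    """Signal-to-quantization-noise ratio in dB."""
    signal = sum(abs(r) ** 2 for r in reference)
    noise = sum(abs(r - m) ** 2 for r, m in zip(reference, measured))
    return 10 * math.log10(signal / noise)

# Deterministic QPSK-like input (illustrative data only).
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
x = [qpsk[(3 * n + n // 7) % 4] for n in range(128)]

reference = dft(x)  # floating-point reference
sqnr_by_bits = {bits: sqnr_db(reference, dft(x, bits)) for bits in (8, 10)}
for bits, value in sqnr_by_bits.items():
    print(f"{bits} coefficient bits: {value:.1f} dB")
```

A fuller model would also quantize the input, the butterfly outputs, and the final output word, which is where the input and output bit resolutions listed on the slides would enter.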
Summary of simulations
IFFT processor
Points   Parallel level   SQNR (dB)   Gate count
256      8                52          about 100k
256      8                30          about 80k

FFT processor
Points   Parallel level   SQNR (dB)   Gate count
128      4                52          about 50k
128      4                30          about 40k
Conclusion
- 256-point IFFT processing for easy low-pass filtering
- Parallel structure for high-speed signal processing
- IFFT/FFT processor
  • 32-point radix-2^4 DIF SDF structure
  • Small area, low power, high-speed operation
  • Canonic Signed Digit multiplier: constant coefficients
  • Modified Booth multiplier: variable coefficients

                                       IFFT processor   FFT processor
Point                                  256-point        128-point
Parallelism                            8                4
Number of input data (sample/clock)    4                4
Throughput (sample/clock)              8                4
Latency (except S/P, reverse unit)     32               32
Number of gates (30 dB SQNR)           About 80K        About 40K