
  • IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 7, JULY 2012 1161

    A Low-Power Low-Cost Design of Primary Synchronization Signal Detection

    Chixiang Ma, Student Member, IEEE, Hao Cao, and Ping Lin, Member, IEEE

    Abstract: Synchronization is an important component of a practical communication system, and network entry, which includes synchronization, is equally important. Because detection of the primary synchronization signal (PSS) is the first step of network entry in long term evolution (LTE) systems, it may be a critical path for practical systems. Therefore, the tradeoff between performance and the power consumption and cost of PSS detection needs to be made carefully. This paper presents a new synchronization method for low-power and low-cost design. The approach of a 1-bit analog-to-digital converter (ADC) with down-sampling is compared with that of a 10-bit ADC without down-sampling under the multi-path fading conditions defined in the LTE standard for user equipment (UE) performance tests [5]. Simulation results for PSS detection are obtained on several kinds of channels. They show that the performance of the method with down-sampling for a 1-bit ADC does not degrade even if a frequency offset exists. Based on the simulation results, different implementation architectures are presented together with their synthesis reports and analysis. A low-power, low-cost design with high performance to detect the PSS is derived in this paper.

    Index Terms: Low cost, low power, matched filter, primary synchronization signal (PSS).

    I. INTRODUCTION

    THE explosive growth of cell phone users and the increasing demand for broadband wireless access have led to the development of long term evolution (LTE) by the 3rd Generation Partnership Project (3GPP) to replace the wideband code division multiple access (WCDMA)-based air interface. The minimum requirements of LTE include packet data support with peak data rates of 300 Mbps in the downlink and 75 Mbps in the uplink, a low maximum latency of 10 ms MAC layer round trip delay, and flexible bandwidth scalability. These requirements result in the adoption of orthogonal frequency division multiplexing (OFDM)-based modulation and multiple access, multiple-input-multiple-output (MIMO) antenna schemes, and adaptive modulation and coding with advanced channel coding, space time coding and hybrid automatic repeat request (ARQ) protocols.

    The synchronization sequence is particularly important because its detection affects not only search time but also demodulation performance. The 3GPP working group undertook rigorous evaluation of different kinds of sequences to improve search time. Consequently, it was decided to adopt Zadoff-Chu (ZC) sequences as the downlink primary synchronization signal (PSS) and the uplink random access preamble. The ZC sequences are a group of general-chirp-like sequences with good correlation properties [1]. To identify the cell and obtain synchronization, the PSS is detected while cell search takes place. Currently used matched filters [2]–[4] are computation-intensive since they require a large number of constant complex multiplications.

    Manuscript received August 26, 2010; revised January 20, 2011; accepted April 20, 2011. Date of publication May 31, 2011; date of current version June 01, 2012.

    The authors are with Beijing Embedded System Key Lab, Beijing University of Technology, Beijing 100124, China (e-mail: [email protected]; [email protected]; [email protected]).

    Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

    Digital Object Identifier 10.1109/TVLSI.2011.2152866

    The main objective of this paper is to propose an efficient and accurate PSS detection method with low power and low cost. The system model, channel model, and PSS definition are introduced in Section II. A brief review of the matched filter approach is presented in Section III. Afterwards, both the method of 1-bit ADC with down-sampling and that of 10-bit ADC without down-sampling for PSS detection are discussed in Section IV, whereas their simulation results are shown in Section V. Section VI addresses different implementation architectures of PSS detection. Finally, concluding remarks are given in Section VII.

    II. SYSTEM MODEL AND PROBLEM DEFINITION

    A. OFDM System Model With Carrier Frequency Offset (CFO)

    3GPP adopts OFDM to improve spectrum efficiency. In OFDM systems, a sequence of complex data symbols is carried on orthogonal subcarriers. During each OFDM block, the sequence of data symbols is defined as follows:

    (1)

    The sequence of data symbols is modulated using an inverse discrete Fourier transform (IDFT) process, with one point per subcarrier, that produces the sequence

    (2)

    where the transform is represented by the normalized square IDFT matrix, together with the quantity defined in

    (3)

    Consequently, each sample in the sequence can be expressed as

    (4)
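For orientation, one standard way to write the transmit model described around (1)–(4) is sketched below. The symbols (block index k, IDFT size N, data vector X_k, time-domain vector x_k, DFT matrix F) follow common OFDM conventions and are assumed here; they are not taken from the original equations.

```latex
\begin{align}
\mathbf{X}_k &= \left[ X_k(0),\, X_k(1),\, \ldots,\, X_k(N-1) \right]^{T} \\
\mathbf{x}_k &= \mathbf{F}^{H} \mathbf{X}_k \\
[\mathbf{F}]_{m,n} &= \frac{1}{\sqrt{N}}\, e^{-j 2\pi m n / N}, \qquad 0 \le m, n \le N-1 \\
x_k(n) &= \frac{1}{\sqrt{N}} \sum_{m=0}^{N-1} X_k(m)\, e^{j 2\pi m n / N}
\end{align}
```

With this convention the last line is simply the nth row of the second line written out, since the (n, m) entry of the conjugate transpose of F is the positive-exponent term.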

    In fading channels, a time-domain guard interval, called the cyclic prefix (CP), is created by copying the last samples of the IDFT output and appending them at the beginning of the OFDM symbol to be transmitted. The transmitted OFDM block therefore consists of the IDFT output samples plus the CP samples.
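The IDFT and cyclic prefix steps described above can be sketched in a few lines of Python. This is a minimal illustration under assumed sizes: the names `idft`, `add_cyclic_prefix`, `N_FFT` and `CP_LEN`, and the values 64 and 16, are hypothetical and not taken from the paper.

```python
import cmath
import math


def idft(freq_symbols):
    """Normalized (unitary) inverse DFT of a list of complex symbols."""
    n_fft = len(freq_symbols)
    scale = 1.0 / math.sqrt(n_fft)
    return [
        scale * sum(
            freq_symbols[m] * cmath.exp(2j * math.pi * m * n / n_fft)
            for m in range(n_fft)
        )
        for n in range(n_fft)
    ]


def add_cyclic_prefix(time_samples, cp_len):
    """Copy the last cp_len samples and prepend them to the block."""
    if cp_len <= 0:
        return list(time_samples)
    return time_samples[-cp_len:] + time_samples


# Illustrative sizes only; they are assumptions, not values from the paper.
N_FFT = 64
CP_LEN = 16

# Simple alternating +1/-1 data symbols as a stand-in for real payload.
symbols = [complex(1 if m % 2 == 0 else -1, 0) for m in range(N_FFT)]
block = add_cyclic_prefix(idft(symbols), CP_LEN)
print(len(block))
```

The transmitted block has `N_FFT + CP_LEN` samples, and its first `CP_LEN` samples are an exact copy of its last `CP_LEN` samples, which is what lets the receiver discard the prefix without losing information.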

    1063-8210/$26.00 © 2011 IEEE


    TABLE I. DELAY PROFILES FOR E-UTRA CHANNEL MODELS

    At the receiver side, after removing the first CP samples, the received sequence

    (5)

    is obtained [9]

    (6)

    The model in (6) contains the normalized CFO together with a term that represents the effect of the accumulated phase rotation caused by the CFO on the time-domain samples

    (7)
    (8)

    It also contains the channel frequency response during each OFDM block

    (9)

    and a zero-mean complex white Gaussian noise sample of fixed variance.

    Assuming that the receiver sampling clock is aligned to that of the transmitter, each element of the received sequence can be expressed as

    (10)
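The accumulated phase rotation caused by a CFO can be illustrated with a short sketch. It assumes the common model in which sample n is rotated by exp(j2πεn/N), with ε the CFO normalized to the subcarrier spacing; the function names and the unit-modulus test sequence are hypothetical, and channel and noise are left out.

```python
import cmath
import math


def apply_cfo(time_samples, eps, n_fft):
    """Rotate sample n by exp(j*2*pi*eps*n/n_fft).

    eps is the CFO normalized to the subcarrier spacing (assumed model).
    """
    return [
        s * cmath.exp(2j * math.pi * eps * n / n_fft)
        for n, s in enumerate(time_samples)
    ]


def correlation_magnitude(received, reference):
    """Magnitude of the zero-lag correlation with a known reference."""
    return abs(sum(r * ref.conjugate() for r, ref in zip(received, reference)))


N_FFT = 64
# Hypothetical unit-modulus test sequence, used only for illustration.
reference = [cmath.exp(1j * math.pi * n * n / N_FFT) for n in range(N_FFT)]

for eps in (0.0, 0.25, 0.5, 1.0):
    rotated = apply_cfo(reference, eps, N_FFT)
    print(eps, round(correlation_magnitude(rotated, reference), 2))
```

The correlation peak shrinks as the normalized offset grows and vanishes when the offset reaches one full subcarrier spacing over the correlation window, which is why a detector that must tolerate large oscillator mismatch tests several frequency hypotheses.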

    B. Channel Propagation Model

    Since the PSS is a component of LTE, the evolved universal terrestrial radio access (E-UTRA) channel model is recommended. Using the E-UTRA channel model makes the evaluation of the proposed method more reasonable and practical.

    The E-UTRA channel model comprises delay profiles, Doppler spectra and channel correlation matrices, and these components can be combined in many ways. The delay profiles are introduced first.

    The delay profiles are selected to be representative of low, medium and high delay spread environments. The resulting model parameters are defined in Table I and the tapped delay line models are defined in Tables II–IV [5].

    TABLE II. EXTENDED PEDESTRIAN MODEL

    TABLE III. EXTENDED VEHICULAR MODEL

    TABLE IV. EXTENDED TYPICAL URBAN MODEL

    TABLE V. CHANNEL MODEL PARAMETERS

    The Doppler spectra are selected to be representative of low, medium and high Doppler spread environments, with maximum Doppler frequencies of 5, 70, and 300 Hz, respectively.

    The combinations of delay profiles and Doppler spectra are also defined in [5]. Only five combinations can be used to evaluate the performance of the receiver in a multi-path fading environment. Table V shows all the combinations.

    Finally, the channel correlation matrices are also selected to be representative of low, medium and high channel correlation environments. Since 4-by-4 simulations are given in Section V, the 4-by-4 correlation matrices are provided as follows [5]:

    (11)

    where the correlation constants are defined in Table VI [5], and the product in (11) is the Kronecker product.

    TABLE VI. DIFFERENT CORRELATION CONSTANT

    TABLE VII. ROOT INDICES FOR THE PSS

    C. PSS

    A synchronization channel (SCH) is specified in the LTE system to transmit the PSS and the secondary synchronization signal (SSS) [1].

    The sequence used for the PSS is generated from a frequency-domain ZC sequence [1] according to

    \[
    d_u(n) =
    \begin{cases}
    e^{-j \pi u n (n+1) / 63}, & n = 0, 1, \ldots, 30 \\
    e^{-j \pi u (n+1)(n+2) / 63}, & n = 31, 32, \ldots, 61
    \end{cases}
    \tag{12}
    \]

    where the ZC root sequence index u is given by Table VII [1].

    The three different ZC sequences are orthogonal to each other, and each sequence corresponds to a sector identity in the range of 0 to 2. The ZC sequence is chosen for its good periodic autocorrelation and cross-correlation properties. In particular, these sequences have a low frequency offset sensitivity, which is described in [8]. Thus, it is easy to detect the PSS during initial synchronization because the ZC sequence has a flat frequency-domain autocorrelation property and a low frequency offset sensitivity.
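As a concrete illustration of the PSS definition, the sketch below generates the three length-62 sequences and compares their zero-lag correlations. It follows the frequency-domain ZC formula and the root indices 25, 29 and 34 as specified in 3GPP TS 36.211; the function names and the normalization are illustrative choices, not part of the paper.

```python
import cmath
import math

# Sector identity -> ZC root index, as specified in 3GPP TS 36.211.
PSS_ROOTS = {0: 25, 1: 29, 2: 34}


def pss_sequence(root):
    """Length-62 frequency-domain PSS sequence for one ZC root index."""
    seq = []
    for n in range(62):
        if n <= 30:
            phase = -math.pi * root * n * (n + 1) / 63
        else:
            phase = -math.pi * root * (n + 1) * (n + 2) / 63
        seq.append(cmath.exp(1j * phase))
    return seq


def normalized_correlation(a, b):
    """Zero-lag correlation magnitude, normalized by the sequence length."""
    return abs(sum(x * y.conjugate() for x, y in zip(a, b))) / len(a)


sequences = {sid: pss_sequence(root) for sid, root in PSS_ROOTS.items()}
for i in sequences:
    row = [round(normalized_correlation(sequences[i], sequences[j]), 3)
           for j in sequences]
    print(i, row)
```

Every element has unit magnitude, so each sequence correlates to exactly 1 with itself, while the values between different roots are clearly smaller. That gap is what the detector relies on to tell the three sector identities apart.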

    III. FUNDAMENTAL DETECTION METHOD

    The main function of the PSS is to detect the boundary of a frame. A non-coherent detection method has to be used at the receiver since no reference information is known initially.

    The matched filter is a basic non-coherent detection method that can be used to detect the PSS efficiently.

    The sequence in (12) is mapped to the subcarriers around DC and transformed to the time domain by a 64-point IDFT. To detect this signal at the receiver, the correlation with the time-domain signal of the ZC sequence is calculated [2]–[4]

    (13)

    in terms of the successive 64-by-1 received signal vector, the DFT matrix, and a 64-by-1 vector composed of the sequence in (12) punctured at DC.

    Then, from (13), the coefficients of the matched filter can be obtained

    (14)

    where

    (15)

    and the matched filter can be expressed

    (16)

    as a function of the received signal.
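The matched-filter idea of this section can be sketched as follows: build the 64-sample time-domain PSS replica, then slide it over the received samples and look for the correlation peak. This is an illustrative model only. The subcarrier mapping (31 subcarriers on each side of an empty DC bin) follows the LTE convention, while the function names, the squared-magnitude metric and the noiseless test signal are assumptions rather than the paper's exact equations (13)–(16).

```python
import cmath
import math


def pss_sequence(root):
    """Length-62 frequency-domain PSS sequence (3GPP TS 36.211)."""
    return [
        cmath.exp(-1j * math.pi * root *
                  (n * (n + 1) if n <= 30 else (n + 1) * (n + 2)) / 63)
        for n in range(62)
    ]


def pss_time_domain(root, n_fft=64):
    """Map the PSS around an empty DC bin and apply a normalized IDFT."""
    grid = [0j] * n_fft
    for n, value in enumerate(pss_sequence(root)):
        k = n - 31 if n < 31 else n - 30   # subcarriers -31..-1 and 1..31
        grid[k % n_fft] = value
    scale = 1.0 / math.sqrt(n_fft)
    return [
        scale * sum(grid[m] * cmath.exp(2j * math.pi * m * t / n_fft)
                    for m in range(n_fft))
        for t in range(n_fft)
    ]


def matched_filter(received, reference):
    """Squared correlation magnitude for every alignment of the replica."""
    taps = len(reference)
    return [
        abs(sum(received[start + i] * reference[i].conjugate()
                for i in range(taps))) ** 2
        for start in range(len(received) - taps + 1)
    ]


reference = pss_time_domain(25)
received = [0j] * 40 + reference + [0j] * 40   # noiseless toy signal
metric = matched_filter(received, reference)
peak_index = max(range(len(metric)), key=metric.__getitem__)
print(peak_index)
```

Each output needs 64 complex multiplications, one per tap, which is the cost the later hardware sections set out to reduce.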

    IV. PRACTICAL DETECTION METHOD

    From the power consumption perspective, a 10-bit analog-to-digital converter (ADC) uses more power than a 1-bit ADC since the 10-bit pipelined ADC has several power amplifiers in it. Typically, the power consumption of a 1-bit 122.88 MHz ADC composed of one comparator is about 200 µW, while the power consumption of a 10-bit 122.88 MHz pipelined ADC is about 50 mW. To come up with a low-power solution, a method of PSS detection using a 1-bit ADC is proposed.

    The PSS is transmitted periodically, twice per frame, and a frame lasts 10 ms. The sampling rate of the receiver is 122.88 MHz; however, the data rate of the input to the matched filter is 1.92 MHz. Thus, 9600 samples at the output of the matched filter need to be buffered during the 5 ms period, which is not area and cost efficient. To come up with a low-cost solution, down-sampling by 8 is applied at the output of the matched filter.
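The rate figures quoted above can be checked with a few lines of integer arithmetic. The constant names are illustrative; the values are the ones stated in the text.

```python
# Figures quoted in the text; names are illustrative.
ADC_CLOCK_HZ = 122_880_000        # receiver sampling clock
FILTER_INPUT_RATE_HZ = 1_920_000  # data rate into the matched filter
PSS_PERIOD_MS = 5                 # PSS repeats twice per 10 ms frame
DOWNSAMPLE_FACTOR = 8

# Matched-filter outputs produced during one PSS period.
outputs_per_period = FILTER_INPUT_RATE_HZ * PSS_PERIOD_MS // 1000

# Outputs left to buffer after down-sampling by 8.
buffered_outputs = outputs_per_period // DOWNSAMPLE_FACTOR

# Clock cycles available for each matched-filter input sample.
cycles_per_sample = ADC_CLOCK_HZ // FILTER_INPUT_RATE_HZ

print(outputs_per_period, buffered_outputs, cycles_per_sample)
```

The third figure, the number of fast clock cycles per matched-filter input sample, is what later allows a single multiplier to be shared across all 64 taps.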

    A. Method Without Down-Sampling by 8 for 10-Bit ADC

    From the last section, the matched filter as expressed in (16) can be reformulated when using a 10-bit, 122.88 MHz pipelined ADC

    (17)

    where the input is the received signal sampled by a 10-bit, 122.88 MHz pipelined ADC, and the coefficients are obtained from (14) and (15).

    Every output of the matched filter is buffered since there is no down-sampling module, so a large buffer is needed, which is very costly in area.


    TABLE VIII. SIMULATION ASSUMPTIONS

    B. Method With Down-Sampling by 8 for 1-Bit ADC

    Equation (16) can be reformulated when using a 1-bit, 122.88 MHz ADC

    (18)

    where the input is the received signal sampled by a 1-bit, 122.88 MHz ADC, and the coefficients are obtained from (14) and (15).

    Every output of the matched filter is down-sampled by 8

    (19)

    which yields the output of the down-sampling module.

    Now, only 1200 outputs need to be buffered during 5 ms, with an additional 1-out-of-8 comparator implementing the down-sampling module. This results in less area, which translates to lower cost in a practical system.
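A minimal sketch of the two operations in this subsection is given below. It assumes that the 1-bit ADC keeps only the sign of the in-phase and quadrature components, and that down-sampling by 8 keeps the largest of every eight matched-filter outputs, consistent with the 1-out-of-8 comparator mentioned above. The function names are hypothetical.

```python
def one_bit_quantize(samples):
    """Keep only the sign of I and Q (assumed 1-bit-per-branch ADC model)."""
    return [
        complex(1 if s.real >= 0 else -1, 1 if s.imag >= 0 else -1)
        for s in samples
    ]


def downsample_by_max(metric, factor=8):
    """Keep the maximum of every complete group of `factor` outputs."""
    return [
        max(metric[start:start + factor])
        for start in range(0, len(metric) - factor + 1, factor)
    ]


# Toy data: 16 increasing correlation values give two groups of eight.
toy_metric = list(range(16))
print(downsample_by_max(toy_metric))

# One 5 ms period of matched-filter outputs shrinks from 9600 to 1200 values.
print(len(downsample_by_max([0.0] * 9600)))

print(one_bit_quantize([0.3 - 0.2j, -1.5 + 0.7j]))
```

Keeping the maximum rather than every eighth sample means a correlation peak is never skipped, at the price of coarser timing resolution inside each group of eight.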

    With the practical method introduced above, its implementation architecture is discussed in Section VI after the simulation performance is discussed in Section V.

    V. SIMULATION RESULTS

    The primary synchronization signal is designed for cell search and handover in 3GPP LTE systems and is transmitted every 5 ms. The search time of PSS detection is an important criterion when measuring its performance.

    To compare the performance using a 10-bit 122.88 MHz ADC without down-sampling and that using a 1-bit 122.88 MHz ADC with down-sampling by 8, the parameters listed in Table VIII are used in the simulation.

    We assume that there are four receive antennas and four transmit antennas in the simulated LTE MIMO system. The replica-based method is very useful for symbol timing detection since a diversity gain of 3 dB can be obtained when two PSSs are received in different time slots. Higher diversity gain can be achieved when more than two PSSs are used in the detection. At most 16 PSSs are transmitted in the simulation; that is, the detection gives up after 16 PSS correlations are calculated at the receiver. From Section II, we know that there are different combinations of delay profiles, Doppler spectra and channel correlation matrices defined in the E-UTRA channel model. To cover different channel assumptions, we simulate both the original method and the proposed method under three typical channel combinations: the EPA 5 Hz model with a low correlation channel matrix, the EVA 70 Hz model with a medium correlation channel matrix, and the ETU 300 Hz model with a high correlation channel matrix.

    Fig. 1. Performance of both methods using low correlation channel matrix and EPA 5 Hz channel model.

    Fig. 1 shows the search time of both methods under the EPA 5 Hz model with a low correlation channel matrix, which indicates that their performance is very close. The results for EVA 70 Hz and ETU 300 Hz are shown in Figs. 2 and 3, respectively. From Figs. 2 and 3, we can see that the method using a 1-bit 122.88 MHz ADC with down-sampling by 8 does not degrade the performance much even if the signal-to-noise ratio (SNR) is very low. When the SNR is larger than 10 dB, their performance is almost identical. As a result, the method of a 1-bit 122.88 MHz ADC with down-sampling by 8 is proposed as the low-power and low-cost design for PSS detection with good search performance.

    VI. HARDWARE IMPLEMENTATION

    As discussed in the previous section, the performance of the proposed method for PSS detection is acceptable in a practical LTE system; thus, its implementation is described in this section, where the matched filter is considered first, followed by the architecture of our proposed PSS detector.

    A. Architecture of Matched Filter

    The matched filter is an important component in the PSS detection, as described in Section III. We use a 64-tap time-domain matched filter; hence 64 complex multiplication units per matched filter are used in the calculation in (16).

    Fig. 2. Performance of both methods using medium correlation channel matrix and EVA 70 Hz channel model.

    Fig. 3. Performance of both methods using high correlation channel matrix and ETU 300 Hz channel model.

    Since 84 matched filters are required in the system, a total of 5376 complex multiplication units would be needed, which is not reasonable for a practical implementation due to the high cost of multiplication units in the receiver. In practice, the sampling rate of the input data to the matched filter is 1.92 MHz while the system clock is 122.88 MHz, so 64 clock cycles are available per input sample and a single complex multiplication unit can be reused over 64 cycles instead of using 64 units. Thus we propose the matched filter structure shown in Fig. 4. As a result, 84 complex multiplication units are enough for the whole system.
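The multiplier-sharing idea can be modeled in software as a single multiply-accumulate unit that is reused once per clock cycle. The sketch below is a behavioral illustration, not the authors' RTL; the function names and the toy data are assumptions.

```python
def parallel_output(window, coeffs):
    """One filter output computed with one multiplier per tap."""
    return sum(w * c for w, c in zip(window, coeffs))


def serial_output(window, coeffs):
    """Same output with a single multiply-accumulate unit.

    Returns the result and the number of cycles the shared unit was busy.
    """
    acc = 0j
    cycles = 0
    for w, c in zip(window, coeffs):
        acc += w * c      # one complex multiplication per clock cycle
        cycles += 1
    return acc, cycles


TAPS = 64
MATCHED_FILTERS = 84
CYCLES_PER_SAMPLE = 122_880_000 // 1_920_000   # system clock / input rate

# Toy window and coefficients, for illustration only.
window = [complex(n % 3 - 1, (n % 5) - 2) for n in range(TAPS)]
coeffs = [complex(1 if n % 2 else -1, 1) for n in range(TAPS)]

value, cycles = serial_output(window, coeffs)
print(cycles <= CYCLES_PER_SAMPLE)                 # fits in one sample period
print(MATCHED_FILTERS * TAPS, MATCHED_FILTERS)     # multipliers: before, after
```

Because the 64 available cycles match the 64 taps, the serial unit finishes each output just before the next input sample arrives, so throughput is unchanged while the multiplier count drops by a factor of 64.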

    Fig. 4. Matched filter architecture with one complex multiply unit.

    B. Architecture of PSS Detection

    A mismatch of up to 14 parts per million (ppm) can exist between the oscillators at the base station (eNodeB) and at the UE, so seven groups of matched filters are used to cover the range of [-14 ppm, +14 ppm], with each group responsible for 4 ppm, corresponding to 2/3 of the subcarrier spacing. Each group contains three matched filters to detect the three different physical-layer IDs of value 0, 1, or 2. Therefore, there are 21 hardware units as shown in Fig. 6 for each receiver antenna. Since the system is MIMO 4-by-4 and there are 4 receiver antennas at the UE end, 84 such hardware units are involved in the architecture of the PSS detection.

    Fig. 5. Original architecture of the whole PSS detection.

    Fig. 6. Area-efficient architecture of the whole PSS detection.
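The count of 84 hardware units can be reproduced by enumerating every combination of receive antenna, frequency-offset group and physical-layer ID. In the sketch below the group centers (-12 to +12 ppm in 4 ppm steps) are an assumption chosen so that seven 4 ppm groups tile the stated range; the paper does not list them. The implied carrier frequency is likewise only an inference from the statement that 4 ppm corresponds to 2/3 of the 15 kHz LTE subcarrier spacing.

```python
from itertools import product

MAX_OFFSET_PPM = 14
GROUP_WIDTH_PPM = 4
NUM_GROUPS = 7
SECTOR_IDS = (0, 1, 2)
RX_ANTENNAS = 4

# Assumed group centers: seven 4 ppm bins tiling [-14, +14] ppm.
first_center = -MAX_OFFSET_PPM + GROUP_WIDTH_PPM // 2
group_centers_ppm = [first_center + GROUP_WIDTH_PPM * g
                     for g in range(NUM_GROUPS)]

# One hardware unit per (antenna, offset group, physical-layer ID).
units = list(product(range(RX_ANTENNAS), group_centers_ppm, SECTOR_IDS))
print(group_centers_ppm)
print(len(units))

# Inference only: 4 ppm equal to 2/3 of a 15 kHz subcarrier spacing.
SUBCARRIER_SPACING_HZ = 15_000
group_width_hz = 2 * SUBCARRIER_SPACING_HZ / 3
implied_carrier_hz = group_width_hz * 1_000_000 / GROUP_WIDTH_PPM
print(implied_carrier_hz)
```

Under these assumptions the 4 ppm step equals 10 kHz, which would correspond to a carrier of about 2.5 GHz.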

    As the sampling rate of the input data to PSS detection is 1.92 MHz and the PSS signal is repetitively transmitted every 5 ms, there are 9600 samples during 5 ms and thus a single-port RAM with 9600 addresses is needed.

    As described above, there are 84 such RAMs in the system, and the area is too large for the UE chip; therefore an area-efficient architecture is proposed as shown in Fig. 6. Compared to the architecture in Fig. 5, a small RAM with 8 addresses is added whose function is to find the maximum value of every eight correlations. As a result, only 1200 correlation values need to be stored in a RAM with 1200 addresses, which reduces the RAM size of the whole system by a factor of almost 8.
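The "factor of almost 8" can be checked by counting RAM addresses in both architectures. The sketch assumes that each of the 84 units has its own correlation RAM and, in the area-efficient version, its own 8-address RAM; word widths are not given in the text, so only address counts are compared.

```python
HARDWARE_UNITS = 84
FULL_RAM_ADDRESSES = 9600      # one address per correlation in 5 ms
GROUP_SIZE = 8                 # correlations compared per stored value
SMALL_RAM_ADDRESSES = 8        # holds one group while its maximum is found

reduced_ram_addresses = FULL_RAM_ADDRESSES // GROUP_SIZE

original_total = HARDWARE_UNITS * FULL_RAM_ADDRESSES
efficient_total = HARDWARE_UNITS * (reduced_ram_addresses + SMALL_RAM_ADDRESSES)

reduction = original_total / efficient_total
print(original_total, efficient_total, round(reduction, 2))
```

The reduction falls slightly short of 8 because the small 8-address RAM is added to every unit.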

    As the implementation is targeted at an application-specific integrated circuit (ASIC) chip, the two architectures are synthesized using SMIC 65-nm technology at 1.2 V. The area and power reports are listed in Table IX. We can observe that the area of the area-efficient architecture is much smaller than that of the original architecture, which reduces the cost of the chip significantly. From the power perspective, not only does the 1-bit ADC reduce the power consumption, but the digital logic hardware does as well.

    TABLE IX. COMPARISONS OF TWO ARCHITECTURES

    VII. CONCLUSION

    In this paper, we address the problem of detecting the primary synchronization signal in the 3GPP LTE system. A down-sampling block, a 10-bit 122.88 MHz ADC and a 1-bit 122.88 MHz ADC are the basic components of the PSS detection methods considered. Theoretically, detection with a 1-bit ADC and with down-sampling would degrade the performance and prolong the detection time. However, due to the inherent advantage of the ZC sequence, our simulation results show that the performance of the proposed method using a 1-bit ADC with down-sampling by 8 does not degrade much compared with that using a 10-bit ADC without down-sampling in the presence of frequency offset under several typical LTE propagation channels. Subsequently, two different implementation architectures of the PSS detection are presented. As the area and the power consumption of the original implementation architecture are too large to be acceptable, a more practical implementation architecture is proposed based on the simulation and ASIC synthesis results. With it, the PSS can be detected efficiently and accurately at much lower power and cost, which makes it feasible for implementation in a UE chip.

    REFERENCES

    [1] 3rd Generation Partnership Project (3GPP), Sophia-Antipolis Cedex, France, "3GPP TS 36.211 v8.9.0 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical Channels and Modulation (Release 8)," Dec. 2009.

    [2] K. Manolakis, D. M. Gutierrez Estevez, V. Jungnickel, X. Wen, and C. Drewes, "A closed concept for synchronization and cell search in 3GPP LTE systems," in Proc. IEEE Wirel. Commun. Network. Conf., 2009, pp. 1–6.

    [3] B. M. Popovic and F. Berggren, "Primary synchronization signal in E-UTRA," in Proc. IEEE 10th Int. Symp. Spread Spectrum Techn. Appl. (ISSSTA), 2008, pp. 426–430.

    [4] "P-SCH sequences," Huawei, Kobe, Japan, 3GPP TSG RAN WG1 Tdoc R1-072321, 2007.

    [5] "3GPP TS 36.101 v8.9.0 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); User Equipment (UE) Radio Transmission and Reception (Release 8)," 3rd Generation Partnership Project, Tech. Rep., Dec. 2009.

    [6] G. Colavolpe and R. Raheli, "Noncoherent sequence detection," IEEE Trans. Commun., vol. 47, no. 9, pp. 1376–1385, Sep. 1999.

    [7] G. L. Stüber, Principles of Mobile Communication, 2nd ed. Norwell, MA: Kluwer, 2001.

    [8] S. Sesia, I. Toufik, and M. Baker, LTE-The UMTS Long Term Evolution: From Theory to Practice. New York: Wiley, 2009.

    [9] Y. Yao and G. B. Giannakis, "Blind carrier frequency offset estimation in SISO, MIMO and multiuser OFDM systems," IEEE Trans. Commun., vol. 53, no. 1, pp. 173–183, Jan. 2005.

    Chixiang Ma (S'10) received the B.S. degree in electrical engineering from Zhejiang University, Hangzhou, China, in 2005. He is currently pursuing the Ph.D. degree at the Beijing Embedded System Key Lab of Beijing University of Technology, Beijing, China.

    His research interests include MIMO and OFDM of wireless communication systems and VLSI design.

    Hao Cao received the B.S. degree in electrical and information from Huazhong University of Science and Technology, Wuhan, China, in 2008. He is currently pursuing the M.S. degree at the Beijing Embedded System Key Lab, Beijing University of Technology, Beijing, China.

    His research interests include synchronization and VLSI design.

    Ping Lin (M'10) received the M.S. degree in electrical engineering from University of Rhode Island, Kingston.

    She is the Director of Beijing Embedded System Key Lab, Beijing University of Technology, Beijing, China. Her research interests include DSP algorithms, VLSI design, wireless communications, and embedded SOC.