FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

download FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

of 13

Transcript of FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    1/13

    Hindawi Publishing CorporationEURASIP Journal on Embedded SystemsVolume 2006, Article ID 81309, Pages113DOI 10.1155/ES/2006/81309

    FPGA-Based Communications Receivers for Smart AntennaArray Embedded Systems

    Constantin Siriteanu,1, 2 Steven D. Blostein,1 and James Millar3

    1 Department of Electrical and Computer Engineering, Queens University, Kingston, ON, Canada K7L 3N62 Communications Signal Processing Laboratory, Department of Electrical and Computer Engineering,

    Hanyang University, Seoul, Korea3 CMC Microsystems, Kingston, ON, Canada K7L 3N6

    Received 15 December 2005; Revised 7 May 2006; Accepted 2 June 2006

    Field-programmable gate arrays (FPGAs) are drawing ever increasing interest from designers of embedded wireless communica-tions systems. They outpace digital signal processors (DSPs), through hardware execution of a wide range of parallelizable commu-nications transceiver algorithms, at a fraction of the design and implementation effort and cost required for application-specificintegrated circuits (ASICs). In our study, we employ an Altera Stratix FPGA development board, along with the DSP Buildersoftware tool which acts as a high-level interface to the powerful Quartus II environment. We compare single- and multibranchFPGA-based receiver designs in terms of error rate performance and power consumption. We exploit FPGA operational flexibilityand algorithm parallelism to design eigenmode-monitoring receivers that can adapt to variations in wireless channel statistics, forhigh-performing, inexpensive, smart antenna array embedded systems.

    Copyright 2006 Constantin Siriteanu et al. This is an open access article distributed under the Creative Commons AttributionLicense, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properlycited.

    1. INTRODUCTION

    Conventional wireless communications systems employ asingle receiving antenna. Enhanced, antenna array receiversemploying beamforming (BF) and maximal-ratio combin-ing (MRC) can generate antenna and diversity gain, that is,increased average and instantaneous (with respect to chan-nel fading) receiver signal-to-noise ratio (SNR) [14]. Al-though beneficial in terms of performance, these enhanced,multibranch algorithms can require much larger compu-tational volumes than the conventional, single-branch re-

    ceiver. Recent analytical and simulation studies [14] ofa hybrid algorithm entitled maximal-ratio eigencombining(MREC) claimed efficient performance-complexity tradeoffsfor smart antenna arrays.

    Receiver algorithms have traditionally been deployed ongeneral-purpose, sequential, digital signal processors (DSPs),or on application-specific integrated circuits (ASICs). En-hanced receiver algorithms, which are generally highly par-allelizable, and higher data transmission rates can burdenDSPs beyond their capacity for real-time processing. Time-critical, highly parallelizable applications are common in ar-eas ranging from modern communications [57] to image[6] and speech [8] processing, and even bioinformatics [9].

    ASICs are hardwired for specific tasks. Although fast (some-times several orders of magnitude faster than DSPs, throughhardware parallelism) and power-efficient, implemented de-signs are inflexible [7]. More importantly, ASIC design andproduction are time-consuming and extremely expensive forchips produced in small numbers, due to very high non-recurring engineering cost.

    Unlike ASICs, field-programmable gated arrays (FPGAs)are reconfigurable, that is, their internal structure is onlypartially fixed at fabrication, leaving to the application de-signer the wiring of the internal logic for the intended

    task. This can significantly shorten design and production,and thus time to market, for FPGA-based embedded sys-tems. Although FPGAs tend to be slower and to consumemore power than ASICs [7], FPGA reconfigurability canbenefit platform longevity (which is extremely importantin an era of fast-changing wireless communications stan-dards) by allowing design changes/upgrades even in sys-tems already in operation. This flexibility can be effectivelyexploited for rapid prototyping of advanced communica-tions signal processing, such as Bell Labs Layered Space-Time(BLAST) multi-input multi-output (MIMO) architecture forthird-generation Universal Mobile Telecommunications Sys-tem (UMTS) [5]. Furthermore, an FPGA can, for example,

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    2/13

    2 EURASIP Journal on Embedded Systems

    implement MRC branches either sequentially, or in paral-lel, or anywhere in between, depending on required speed,available chip resources, and power constraints. FPGA-basedimplementations concurrently operating several hardwaremodules can outpace many times their processor-basedcounterparts [6, 9]. An insightful DSP, FPGA, and ASIC

    implementation comparison for a four-antenna orthogo-nal frequency-division multiplexing (OFDM) receiver can befound in [7].

    FPGAs are especially well suited for embedded systems(e.g., cellular system base station line cards, or mobile sta-tions) because, beside an area of reconfigurable logical ele-ments, they can also incorporate large amounts of memory,high-speed DSP blocks, clock management circuitry, high-speed input/output (I/O), as well as support for externalmemory, and high-speed networking and communicationsbus standards. For a small share of the resources, processorscan be included within the FPGA fabric as well[9].

    Power consumed in embedded systems is, in general,

    strictly limited. Otherwise, line-powered designs would re-quire special and/or expensive power sources and heat sinksor may not operate reliably, while portable devices wouldquickly deplete the battery[10, 11]. Although FPGA chipsare judiciously manufactured for power efficiency, applica-tion designers also need to carefully consider this issue be-cause a consistently underutilized design wastes static anddynamic powers [1013].

    The objective of this paper is to investigate FPGA suit-ability for efficient smart antenna array embedded receivers.In the process, we overview an Altera FPGA-based designenvironment, and implement conventional and enhanced(BF, MRC, MREC) receiver algorithms. It is demonstrated

    that FPGA implementations of eigenmode-based combiningadapted to the slow variations in channel statistics can yieldnear-optimum bit error rate (BER) performance, for afford-able power budgets.

    The paper is organized as followsSection 2presents thereceived signal model, and overviews BF, MRC, and MREC.Section 3 describes the Altera software and hardware em-ployed to design, simulate, analyze, and implement these re-ceiver algorithms. Comparative performance and cost resultsare provided inSection 4.

    2. SIGNAL MODEL AND COMBINING METHODS

    2.1. Received-signal model

    Consider a source transmitting a BPSK signal through afrequency-flat Rayleigh fading channel, and anL-element re-ceiving antenna array. After demodulation, matched filter-ing, and symbol-rate sampling, the complex-valued receivedsignal vector is given by [4]

    y= Esbh+n, (1)where dependence on the sampling time is not explicit, to

    simplify notation. The L elementsyi,i = 1 : L 1, . . . , L,of the received signal vector

    y = [

    y1

    y2

    yL]T are called

    branches, and the elements

    hi,i = 1 : L, of the channel vec-

    torh = [h1h2 hL]T, are called channel gains. In (1), Esis the energy transmitted per symbol, andb is the transmit-ted BPSK symbol, with |b|2 = 1 (b = 1 for transmitted bit0,b= 1 for transmitted bit 1). We assume that the channel

    vectorhand the noise vectornare complex-valued, mutually

    independent, zero-mean Gaussian, with

    h CN(0, R

    h) and

    n CN (0, N0IL), respectively. Further assumptions are thatchannel fading[14] is frequency-flat with unit variance oneach branch, the noise is temporally white, and the receivedsignal is interference-free. This signal model is simple, yetsufficient for basic performance evaluations [15]. Current-standard wireless communications signaling is beyond thescope of this work.

    2.2. Azimuth angle spread model

    Due to radio-wave scattering, transmitted signals are re-ceived with azimuthal dispersion[14, 16]. Without loss of

    generality, numerical results presented herein assume trun-cated Laplacian power azimuth spectrum (p.a.s.) [4] becauseit accurately models empirical results [16]. The p.a.s. rootsecond central moment is denoted as azimuth spread(AS)[16]. Analytical expressions for the elements ofRh, obtainedthrough straightforward calculations in [4] for a uniform lin-ear array (ULA), indicate that antenna correlation (and thusreceiver BER performance [1,2]) is a function of p.a.s. type,azimuth spread, average angle of arrival (which is assumedto be zero with respect to the broadside, for all the resultsshown later), and normalized interelement distance dn (i.e.,the ratio between the physical interelement distance and halfof the carrier wavelength).

    The azimuth spread depends on the environment and an-tenna array location/height, and is variable [16]. Radio chan-nel measurements for sub/urban scenarios [16] showed thatbase station azimuth spread is well modeled as a log-normalrandom variable [16, equation (9)]. For typical urban sce-narios [16, Table I], these measurements found that base-station azimuth spread correlation decreases exponentiallywith the distance traveled by the mobile [16, equation (14)].The azimuth spread decorrelation distance, that is, the dis-tance over which the azimuth spread correlation decreasesby a factor of two, was determined asdAS=50 m[16]. Com-paringdAS with the fading coherence distance [17, equation(4.40.b)]dc computed for the typical system parameter val-ues fromTable 1,we conclude that the azimuth spread vari-ation is much slower (by about 3 orders of magnitude) thanthe fading. Furthermore, for this typical urban scenario, itwas found in [16] that Pr(1 < AS < 20) 0.8, that is,azimuth spread is small to moderate, producing significant(greater than 0.5) correlations between adjacent elements ofa compact ULA, for example,dn =1 [1,3].

    2.3. MRC and BF

    For perfectly known channel (p.k.c.), the optimum (maxi-mum-likelihood) receiver linearly combines the receivedsignal vector with the channel vector, that is, it computes

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    3/13

    Constantin Siriteanu et al. 3

    Table 1: Mobile, channel, and receiver (channel estimation) pa-rameters.

    Parameter Value

    Mobile speed v= 60 km/h

    Transmitted BPSK symbol rate fs = 10 ksps

    Carrier frequency fc = 1.8GHzPilot symbol period [18, Section III.C] Ms = 7

    Maximum Doppler frequency fD = 100Hz

    Normalized maximum Doppler frequency fm =fD/ fs = 0.01

    Channel coherence time [17, equation (4.40.b)] Tc 1.8 ms

    Channel coherence distance dc = v,

    Tc 30mm

    Interpolator size [18, Section III.D] T =11

    hHy, and then detects the BPSK symbol asb= sign hHy. (2)

    This approach is also known as maximal-ratio combining(MRC) [19] because it maximizes the SNR (instantaneous,i.e., conditioned on the channel gains) at the combiners out-put. MRC with L = 1 reduces to the conventional, single-branch, receiver.

    In actual systems, with imperfectly known channel(i.k.c.), knowledge of the channel gains is acquired throughestimation [1,18]. The received symbol can then be detected

    asb = sign{[gHy]}, whereg = [g1 g2 gL]T, andgi,i = 1 : L, are the channel gain estimates. This combiningapproach has often been employed and studied[1,3,15,19],although it is suboptimal (when the channel gains are notindependent and identically distributednon-i.i.d.)[3].

    MRC is known to provide full diversity gain [19]thatis, the greatest performance improvement, averaging overfading and noise, compared to a single-branch systemfori.i.d. branches. This requires either widely spaced elements,which are unfeasible for pocketsize mobile stations, or richscattering, which is unlikely at base stations [16].

    For narrow azimuth spread, received signals are highlycorrelated [1,2] and the received signal energy, proportional

    to tr(Rh) Li=1(Rh)i,i = Li=1i, wherei,i = 1 :L, are theeigenvalues ofRh, is concentrated within the first few eigen-modes. Then, the channel is said to be spatially nonselective,and the available diversity gain is small [2022]. Enhancedperformance can then be obtained by taking advantage of an-tenna gain using maximum average SNR beamforming (BF),that is, by combining the received signal vector with the dom-inant eigenvector ofRh[14]. Increasing azimuth spread de-creases antenna correlation, that is, the channel becomes spa-tially more selective and higher diversity gain becomes avail-able [14]. In subsequent sections, we show how to exploitavailable antenna and diversity gains within complexity andpower constraints.

    2.4. Eigencombining method

    BF has traditionally been applied in scenarios with very smallazimuth spread. Otherwise, MRC has been employed. How-ever, it was recently claimed that a unifying approach, calledmaximal-ratio eigencombining (MREC), and described

    below, can adapt to channel correlation (i.e., azimuth spread)variation [14, 20]. Our analytical and simulation resultshave shown that MREC may thus outperform MRC and BFin terms of BER performance and complexity [14].

    The channel correlation matrixRh has real nonnegativeeigenvalues1 2 L 0, orthonormal eigenvec-torsei,i = 1 : L, and can be decomposed as Rh = ELLEHL,where L diag{i}

    Li=1 is a diagonal matrix, and EL

    [e1e2 eL] is a unitary matrix. Hereafter, Rh,L, and ELareassumed perfectly known because, in practice, enough inde-pendent channel samples would be available for an accurateestimation. Actual MREC could employ computationally in-significant low-rate eigenstructure updating [20].

    MREC of orderNconsists of the following steps[14]:(i) Karhunen-Loeve transformation (KLT) [22] of the re-

    ceived signal vector from (1) with the full-column rank

    matrixEN[e1e2 eN]; the elements of the trans-

    formed signal vector,y = EHNy = EsbEHNh+ EHNn =Esbh+n, are denoted aseigenbranches;

    (ii) MRC of theNeigenbranches.

    The components of the transformed channel gain vectorh =

    EHNh are further referred to as channel eigengains. They aremutually uncorrelated, with zero mean, and variances2hi

    E{|hi|2} = i, that is,Rh E{hhH} = N = diag{i}Ni=1,

    for any channel gain distribution [21]. From the initial as-sumptions on fading and noise we obtainh CN(0,N),andn =EHNn CN(0, N0IN), so that the eigengains are in-dependent, which supports straightforward MREC analysis[14].

    Of all possible transforms, the KLT packs the largestamount of energy from the original, L-dimensional signalvectory into the transformed,N-dimensional signal vectory[22], which is desirable for dimension (i.e., complexity) re-duction. Note also that MREC of order N =1 represents infact BF, while it can be shown that full-MREC, that is, MRECof orderN =L, is equivalent to MRC [14].

    2.5. Order selection for MREC

    A simple criterion for optimal MREC order selection is [21]

    minN=1:L

    Es

    Li=N+1

    i+N0N

    , (3)

    better known as the bias-variance tradeoff criterion [3, 4](BVTC) because (3) balances the loss incurred by remov-ing the weakest (L N) intended-signal contributions (thefirst term) against the residual-noise contribution (the sec-ond term). Computer evaluations found the BVTC effec-tive for MREC adaptation to channel conditions [3,4]. Note

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    4/13

    4 EURASIP Journal on Embedded Systems

    Native blocks(floating point)

    Signal processing

    Communications

    Channels, noise

    Simulink

    DSP builder blocks(fixed point)

    Signal compiler

    Arithmetic HIL

    Gate, control, rate change

    Storage, I/O, bus

    IBM PC

    Compiler, simulator

    Synthesizer & fitter

    Timing analyzer

    Powerplay analyzer

    Chip programmer

    Quartus IIMATLAB

    Altera stratix EP1S80B956C6 FPGA

    - Process: 1.5 V, 0.13 m, SRAM

    - Chip pins: 956- Programmable: 79040 logic elements

    - DSP blocks: up to 176, 9 9 bit, embedded multipliers

    - Clocking: up to 16 global clocks; 12 real-time reconfigurable PLLs- Memory: approx. 7.5 Mb RAM- Interfaces: DDR/SDR DRAM, rapidIO, ethernet, PCI

    Altera stratix EP1S80 DSP development board

    Figure1: FPGA development system hardware/software diagram.

    however that since BVTC disregards the MREC complexity,it can overload limited resources.

    A different MREC adaptation criterion is described next.Assume that signals received (independently) fromNu mo-bile stations require processing at a base station with onlyNe NuLavailable eigenbranch processing modules. Then,a control algorithm determines the largest (dominant)Neeigenmodes among all transmitting mobiles, and allocates

    available resources accordingly. For instance, if a receivingantenna array system withL = 4 elements has onlyNe = 3available eigenbranch processing modules whileNu =2, theavailable resources are allocated as follows: if the 3 largesteigenvalues (out ofNuL = 8) are such that two correspondto User 1, and one to User 2, then two eigenbranch process-ing modules are allocated to process the received signal vec-tor from User 1, and the other available eigenbranch is al-located to User 2. This approach to selecting eigenbranchesfor MREC is hereafter denoted as theeigenvalue-based trade-offcriterion(EVTC), while MREC adapted based on EVTC isreferred to as EVTC MREC.

    2.6. Channel estimation using pilot-symbol-aidedmodulation (PSAM)

    In PSAM, the transmitter periodically inserts known pilotsymbolsbpof energyEp(=Esfor results shown herein), intothe information-encoding symbol stream, and the receiverinterpolates the pilot samples acquired across several slotsto estimate the channel during data symbols [14,18]. Thenotation (t, m) is used below to denote temporal indexing,wheret= T1: T2is the time slot index, andm= 0 :Ms1is the symbol index within the slot of lengthMs. Heret =0refers to the slot in which estimation takes place,m = 0 cor-responds to pilot symbols, andm =1 : Ms 1 corresponds

    to data-encoding symbols;T =T1+T2+ 1 slots (in general,T1 = T2) are used for interpolation.

    The estimate of theith eigengain at themth data symbolposition in the current slot can be written as

    gi(0, m)= vHi (m)ri, (4)

    wherevi(m) is the interpolation filter and

    ri 1Epbp

    yiT1, 0, . . . ,yiT2, 0T (5)contains the samples taken during pilot symbols.

    The interpolation filter chosen for the numerical resultsshown later is the filter with brick-wall-type frequency re-sponse, which is optimum in the absence of noise; we willrefer to this filter, with impulse-response tapered by a raised-cosine window [1,2], as the SINC filter, and the correspond-ing estimation approach as SINC PSAM. The interpolatorcoefficients, given by

    v(m)t+T1+1 = sincm

    M

    t cos[(m/M t)]

    1

    [2(m/M

    t)]2

    , (6)

    enter the FPGA-based receiver designs fromSection 4.Notethat channel estimation is among the most demanding re-ceiver functions resource-wise[5].

    3. FPGA HARDWARE AND SOFTWARE

    3.1. FPGA system description

    CMC Microsystems provided the system shown inFigure 1.The Altera DSP Development Kit Stratix ProfessionalEdition, which comprises the Stratix EP1S80 DSP develop-ment board, is built around the Stratix EP1S80B956C6 FPGA

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    5/13

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    6/13

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    7/13

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    8/13

    8 EURASIP Journal on Embedded Systems

    MATLAB functions and scr ipts/native Simulink blocks (floating point)

    Transmitter

    Data sourceSlot generator, PSAM;

    Ms = 7 : p =0; d=random 0/1d

    pd d d d d dp d d d d d d

    BPSK modulatorinput =0; b= +1input =1; b= 1 E

    1/2b

    Channel

    Azimuth spreadgenerator

    Rh E

    Fadingfm = 0.01

    hNoise n

    i N0 Oscillator/PLL clk

    BVTC MRECorder selector

    MREC adaptation

    NClock gating emulation

    ifi N, clki = clk

    ifi > N, clki = inactive

    clki

    ei

    KLT

    clki

    eHiyy

    Delay and storage

    ri = [yi( T1,0), . . . ,yi(+T2,0)]/(E1/2p bp )

    clkiyi

    Interpolation

    gi = vHriChannel estimation

    Receiver

    DSP builder Simulink blocks (fixed point)

    ( )

    g1 y1

    gi yi

    gNyN

    Data sink

    BPSK demodulator

    b= Re( );b > 0, output 0b < 0, output 1

    Figure4: Transmitter, channel, and FPGA-based BVTC MREC receiver diagram.

    for the resource and power usage report. Note that a stand-

    alone BF implementation takes about as many resources asorder-1 MREC takes in the BVTC MREC implementationsince these two designs are almost identical. Furthermore,MRC can be obtained from an MREC design by bypass-ing the KLT. Thus, an MREC design can easily be recon-figured (even during operation, on the fly) to implementBF or MRC instead. Implementation details are provided inFigure 4, for the case when the receiver implements BVTCadaptive MREC.

    For resource/power usage and performance evaluation,we model a typical urban scenario for realistic channel con-ditions from the base station perspective [16], and applythe conventional and enhanced receiver combining algo-

    rithms (after estimating channel gains and eigengains as inSection 2.6) to detect the transmitted symbols. Using MAT-LAB and Simulink, the actual log-normal distributed, time-correlated azimuth spread is simulated and then employedto compute the spatial correlation matrix, for realistic Lapla-cian power azimuth spectrum (p.a.s.) [16]seeFigure 4. Inan actual embedded receiver, the channel correlation matrixand its eigenvalue decomposition could be updated by a pro-cessor (e.g., Alteras soft-core FPGA-based Nios II). We se-lected a correlation update period of 0.14 second (denotedfurther as a frame, corresponding to a distance of roughly2.3 m traveled by the mobile) since the azimuth spread re-mains relatively constant over this interval [16], providing

    the processor with sufficient time and uncorrelated samples

    for eigenstructure updating [3,4]. The computed correlationmatrixRhinputs a customized Simulink multipath Rayleighfading channel block to simulateL= 4 correlated branches.

    The top subplot inFigure 5depicts an azimuth spread se-quence generated using the model described inSection 2.2.The predominantly small-to-moderate azimuth spread val-ues indicate that we should often expect significant spatialcorrelation [1,3], that is, small available diversity gain. Per-formance enhancement can then arise from BF antenna gain.Occasionally however, the azimuth spread can also becomefairly large, but then the available diversity gain cannot ben-efit BF performance. On the other hand, significant diver-sity gain may be available too infrequently to justify perma-

    nent use of an MRC receiver. As we will see, an FPGA-basedMREC receiver can provide, for a channel with slowly vary-ing statistics, flexibility that yields affordable performance.

    The main benefit of an FPGA-based BVTC adaptiveMREC receiver is that unnecessary eigenbranches can bevirtually turned offusing the clock gating technique [12] toreduce dynamic power loss, while necessary eigenbranchescan be implemented to run in parallel, for high speed. Ex-empting weak eigenbranches can also benefit performance[1]. Furthermore, as mentioned earlier, an MREC imple-mentation can easily be reduced to standalone BF or MRCimplementations, if required, either at system setup or dur-ing operation.

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    9/13

    Constantin Siriteanu et al. 9

    Typical urban scenario:v= 60km/h,dAS= 50 m30

    20

    10

    0Azimuthspread

    (degrees)

    0 50 100 150 200

    Distance (m)

    ULA:L= 4,dn = 1;Es/N0 = 5 dB; fm = 0.01;

    SINC PSAM;Ms = 7;T= 11

    4321

    N

    fromBVTC

    0 2 4 6 8 10 12 14

    Time (s)

    0.15

    0.1

    0.05

    0AverageBER

    MRC,L= 1 BF BVTC MREC MRC,L= 4

    Figure 5: Azimuth spread, MREC order selected with the BVTC,and BER performance (averaging over trial) for BF, MRC, andBVTC MREC.

    Altera documentation states that clock gating is avail-able only through lower-level (Quartus II) design. Therefore,clock gating was only emulated in DSP Builder, for the BVTCMREC implementation shown inFigure 4.First, nonadap-tive MREC designs withN =1 : 4 eigenbranches were com-piled to determine their resource usage (shown inTable 2).

    Then, after each eigenstructure update during the BVTCMREC simulation, we stored the selected MREC orders anddisconnected unused eigenbranches from the active struc-ture. Finally, average resource usage was computed.Figure 5shows in the middle subplot the MREC order selected adap-tively using the BVTC, and in the lower subplot the BER av-eraged over the trial. Notice that forL = 4, MRC and BVTCadaptive MREC slightly outperform BF, and greatly outper-form the single-branch receiver.

    For the same typical urban scenario and system param-eters,Figure 6shows resource usage, in percentage points ofthe total available, and dynamic power consumption, aver-aged over 8 trials. In each trial, the azimuth spread sam-

    ples are correlated, as described inSection 2.2, but the az-imuth spread sequences are independent between trials. Notethat BF and BVTC MREC require a significantly smallershare of the FPGA programmable fabric, that is, LEs, com-pared to MRC (for L = 4), but more dedicated DSPblocks, due to KLT. The upper-right subplot appears to im-ply more chip pins demand for BF and MREC, because aMATLAB/Simulink-computed eigenvector matrixENinputsthe FPGA. Nevertheless, eigenstructure updating is possiblewith a soft processor, from within the FPGA.

    Figure 7shows performance and total (dynamic + static)power used by a cellular operators large network of basestations similar to the one described in [11]. The single-

    branch receiver consumes least but performs poorly. For per-formance similar to BF and BVTC MREC, MRC (with L= 4)doubles the dynamic power loss (see alsoFigure 6(d)). Thus,BF and BVTC MREC appear to provide a better tradeoff. Re-call however that a compact ULA withdn =1 is considered.For larger interelement distances (feasible at base stations),

    MREC with more than one eigenbranch can significantlyoutperform BF [4].Note that significant branch correlation can occur even

    at mobile stations, due to limited antenna spacing, so that anFPGA-based BVTC MREC implementation employing clockgating can efficiently achieve near-optimum performance.

    Notice from Figure 5(b) that, frequently, only one ortwo (out of the four implemented) eigenbranches were actu-ally employed for MREC for that particular azimuth spreadsequence. Similar results were obtained in other trials forindependent azimuth spread sequences. This suggests thatadaptive FPGA chip resource allocation among several ac-tive users may significantly increase base station user process-

    ing capacity, or, equivalently, reduce the required number ofFPGA chips per base station, lowering both hardware costand static power losses. A possible path towards such imple-mentations is described next.

    4.3. Enhanced MREC receiver designs: the caseof two users processed per FPGA chip

    EVTC-based adaptive MREC, described inSection 2.5,canprovide more consistent use of the FPGA chip, compared toBVTC MREC. We propose to efficiently exploit a total of 3eigenbranch processing modules, which fit into our FPGA, toprocess concurrently the signals received with L= 4 branches

    from two mobiles (without interference). Rather than per-manently allotting chip processing resources to a certain user(which may or may not need to use them, depending onchannel conditions and required performance), herein wewill adaptively deploy these resources to simultaneously de-tect the symbols transmitted from two mobiles.

    Resource usage information for EVTC MREC whenN =1 : 3 eigenbranches are selected can be found in Table 2.Note that the BVTC and EVTC MREC implementations dif-fer significantly only in the required number of chip pins.The larger number of pins required for EVTC MREC (to in-put the received signals from two mobiles) limits to 3 the pos-sible number of implemented eigenbranches. LargerNeleads

    to unsuccessful compilation. Mutually independent azimuthspread sequences for the signals arriving at the base stationfrom the two mobile stations were simulated, as shown in thetop subplots ofFigure 8.The MREC orders selected with theEVTC for each of the users are shown in the middle subplots.The lower subplots indicate that EVTC MREC can performremarkably close to the enhanced receivers discussed previ-ously.

    Figure 9(a) indicates that our FPGA would not fit con-current four-branch MRC implementations for the twousers. On the other hand, the successfully compiled two-userEVTC MREC implementation with Ne = 3 requires abouthalf of the dynamic power consumed by MRC, for similar

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    10/13

    10 EURASIP Journal on Embedded Systems

    80

    70

    60

    50

    40

    30

    20

    10

    0

    Logicelem

    ents(%)

    MRC, 1 BF BVTC MREC MRC, 4

    50

    40

    30

    20

    10

    0

    Pin

    s(%)

    MRC, 1 BF BVTC MREC MRC, 4

    50

    40

    30

    20

    10

    0

    DSPblocks(%)

    MRC, 1 BF BVTC MREC MRC, 4

    250

    200

    150

    100

    50

    0

    Dynamicpower(mW

    )

    MRC, 1 BF BVTC MREC MRC, 4

    Figure 6: Average resource and dynamic power usage for BF, BVTC MREC, and MRC, over 8 trials with mutually independent azimuth

    spread sequences.

    performance. Furthermore, since EVTC MREC allows for ef-fective concurrent processing of two users on a single FPGA,it yields a twofold reduction in static power consumption ora doubling of the base station user processing capacity. Thus,both implementation and operational costs can be drasticallyreduced with EVTC MREC.

    Ideally, an FPGA-based embedded base station receiverwould comprise: (1) a number of FPGAs programmed forKLT, channel estimation, signal combining, and symbol de-

    tection; (2) an embedded processor monitoring each userschannel conditions (i.e., eigenmodes). At the beginning ofeach frame, the embedded processor browses a user hierar-chy, and allocates the FPGA resources so as to achieve de-sired performance for minimum resource/power consump-tion [3,4]. Thus, it is possible that for a certain period, sev-eral users whose respective received signals are highly corre-lated will share the resources of a single FPGA because noneof them will demand a large number of eigenbranches. Ifthe azimuth spread for one of these users later widens sig-nificantly (yielding more available diversity gain) or if itsSNR degrades (while a certain steady performance level isimposed), a larger share of the FPGA resources can be al-

    located accordingly. An FPGA-based embedded system for aperformance- and a power-aware antenna array receivers canthus be flexibly implemented.

    5. CONCLUSIONS

    We have described and implemented adaptive techniquesthat enhance the performance and reduce the powerconsumption for Altera-FPGA-based embedded wireless

    receivers. We found that smart antenna array receiver algo-rithms, for example, beamforming (BF) and maximal-ratiocombining (MRC), outperform the conventional, single-branch receiver, but the performance gain may not alwaysjustify the additional implementation and operational costs.Tracking the slowly varying dominant channel eigenmodes,and using maximal-ratio eigencombining (MREC) is foundto benefit more than BF and MRC from the parallelismand flexibility of FPGA-based implementation. For simi-lar performance, a twofold increase in user processing ca-pacity or decrease in power consumption is found possi-ble over MRC, for a typical urban scenario and 4 receiv-ing antennas. Adaptive MREC outperforms BF, for slightly

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    11/13

    Constantin Siriteanu et al. 11

    ULA:L= 4,dn = 1;Es/N0 = 5 dB; fm = 0.01;

    SINC PSAM;Ms = 7;T= 11

    0.1

    0.05

    0

    BER

    MRC, 1 BF BVTC MREC MRC, 4

    920900880860840820

    800Total

    powerconsumption

    (kW)

    MRC, 1 BF BVTC MREC M RC, 4Static power loss=781.2 kW

    Figure 7: Average BER and total (static+dynamic) power consumption for BF, BVTC MREC, and MRC, over 8 independent azimuth spreadtrials.

    User 130

    20

    10

    0Azimuthspread(degrees)

    0 50 100 150 200

    Distance (m)

    User 230

    20

    10

    0Azimuthspread(degrees)

    0 50 100 150 200

    Distance (m)

    3

    2

    1N

    fromEVTC

    0 5 10

    Time (s)

    3

    2

    1N

    fromEVTC

    0 5 10

    Time (s)

    0.15

    0.1

    0.05

    0

    BER

    MRC, 1 BF BVTC EVTC MRC, 4

    0.15

    0.1

    0.05

    0

    BER

    MRC, 1 BF BVTC EVTC MRC, 4

    Figure8: Azimuth spread, EVTC MREC order, and average BER performance, for two users.

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    12/13

  • 8/13/2019 FPGA BasedCommunications Receivers for Smart Antenna Siriteanu Blostein Millar EURASIP

    13/13

    Constantin Siriteanu et al. 13

    [4] C. Siriteanu, Maximal-ratio eigen-combining for smarterantenna arrays, Ph.D. dissertation, Queens University,Kingston, ON, Canada, September 2006.

    [5] M. Guillaud, A. Burg, M. Rupp, E. Beck, and S. Das, Rapidprototyping design of a 4 4 BLAST-over-UMTS system, inProceedings of Conference Record of the 35th Asilomar Confer-ence on Signals, Systems and Computers, vol. 2, pp. 12561260,

    Pacific Grove, Calif, USA, November 2001.[6] B. L. Hutchings and B. E. Nelson, Gigaop DSP on FPGA,

    in Proceedings of IEEE International Conference on Acoustics,Speech and Signal Processing (ICASSP 01), vol. 2, pp. 885888,Salt Lake City, Utah, USA, May 2001.

    [7] C. Ebeling, C. Fisher, G. Xing, M. Shen, and H. Liu, Imple-menting an OFDM receiver on the RaPiD reconfigurable ar-chitecture, IEEE Transactions on Computers, vol. 53, no. 11,pp. 14361448, 2004.

    [8] F. L. Vargas, R. D. R. Fagundes, and D. Barros Jr., A FPGA-based Viterbi algorithm implementation for speech recogni-tion systems, inProceedings of IEEE International Conferenceon Acoustics, Speech and Signal Processing (ICASSP 01), vol. 2,pp. 12171220, Salt Lake City, Utah, USA, May 2001.

    [9] T. S. T. Mak and K. P. Lam, Embedded computationof maximum-likelihood phylogeny inference using platformFPGA, in Proceedings of IEEE Computational Systems Bioin-

    formatics Conference (CSB 04), pp. 512514, Stanford, Calif,USA, August 2004.

    [10] S. Sharp, Conquering the three challenges of power con-sumption,XCell Journal, no. 53, 2005.

    [11] A. Telikepalli, Performance vs. power: getting the best of bothworlds,XCell Journal, no. 54, 2005.

    [12] L. Benini, G. De Micheli, and E. Macii, Designing low-powercircuits: practical recipes, IEEE Circuits and Systems Maga-zine, vol. 1, no. 1, pp. 625, 2001.

    [13] A. Yang, Design techniques to reduce power consumption,XCell Journal, no. 54, 2005.

    [14] W. C. Jakes, Ed.,Microwave Mobile Communications, John Wi-

    ley & Sons, New York, NY, USA, 1974.[15] J. Proakis, Digital Communications, McGraw-Hill, Boston,

    Mass, USA, 4th edition, 2001.[16] A. Algans, K. I. Pedersen, and P. E. Mogensen, Experimental

    analysis of the joint statistical properties of azimuth spread,delay spread, and shadow fading,IEEE Journal on Selected Ar-eas in Communications, vol. 20, no. 3, pp. 523531, 2002.

    [17] T. S. Rappaport, Wireless Communications. Principles and Prac-tice, Prentice-Hall, Upper Saddle River, NJ, USA, 1996.

    [18] J. K. Cavers, An analysis of pilot symbol assisted modulationfor Rayleigh fading channels, IEEE Transactions on VehicularTechnology, vol. 40, no. 4, pp. 686693, 1991.

    [19] M. K. Simon and M.-S. Alouini,Digital Communication overFading Channels: A Unified Approach to Performance Analysis,John Wiley & Sons, New York, NY, USA, 2000.

    [20] C. Brunner, W. Utschick, and J. A. Nossek, Exploit-ing the short-term and long-term channel properties inspace and time: eigenbeamforming concepts for the BSin WCDMA,European Transactions on Telecommunications,vol. 12, no. 5, pp. 365378, 2001, special issue on Smart An-tennas,http://www.chrisbrunner.org.

    [21] J. Jelitto and G. Fettweis, Reduced dimension space-timeprocessing for multi-antenna wireless systems,IEEE WirelessCommunications, vol. 9, no. 6, pp. 1825, 2002.

    [22] F. Dietrich and W. Utschick, On the effective spatio-temporalrank of wireless communication channels, in Proceedings ofthe 13th IEEE InternationalSymposium on Personal, Indoor and

    Mobile Radio Communications, vol. 5, pp. 19821986, Lisbon,Portugal, September 2002.

    [23] T. Simunic, L. Benini, and G. De Micheli, Energy-efficientdesign of battery-powered embedded systems, IEEE Trans-actions on Very Large Scale Integration (VLSI) Systems, vol. 9,no. 1, pp. 1528, 2001.

    Constantin Siriteanu was born in Sibiu,

    Romania, in 1972. He received his B.S.and M.S. degrees in electrical engineering,from Gheorghe Asachi Technical Univer-sity, Iasi, Romania, in 1995 and 1996, re-spectively. Between 1995 and 1997, he was aPart-Time Engineer with the Research Insti-tute for Automation, Iasi, Romania, work-ing on data transmission over power lines.Between 1996 and 1998, he was a ResearchAssistant with the Department of Automatic Control and Com-puter Science, Gheorghe Asachi Technical University, Iasi, Ro-mania, working on digital control systems. Since 1998, he hasbeen a Research Assistant with the Department of Electrical andComputer Engineering, Queens University, Kingston, Canada. His

    Ph.D. research has been in adaptive signal processing for smartantenna array receivers, with a focus on performance-complexitytradeoffs based on channel statistics. Between 2004 and 2006, hehas also been a Course Instructor for the 4th year undergraduateElectrical/Computer Engineering Project Course at Queens Uni-versity.

    Steven D. Blostein received his B.S. degreein electrical engineering from Cornell Uni-versity, Ithaca, NY, in 1983, and the M.S.and Ph.D. degrees in electrical and com-puter engineering from the University ofIllinois at Urbana-Champaign, in 1985 and1988, respectively. He has been on the fac-ulty at Queens University since 1988 andhe currently holds the position of Professorand Head of the Department of Electricaland Computer Engineering. From 1999 to 2003, he was the Leaderof the Multirate Wireless Data Access Major Project sponsored bythe Canadian Institute for Telecommunications Research. He hasalso been a Consultant to industry and government in the areasof image compression and target tracking, and was a Visiting As-sociate Professor in the Department of Electrical Engineering atMcGill University in 1995. His current interests lie in the applica-tion of signal processing to wireless communications systems, in-cluding smart antennas, MIMO systems, and space-time frequencyprocessing for MIMO-OFDM systems. He served as Chair of IEEEKingston Section in 19931994, Chair of the Biennial Symposiumon Communications in 2000 and 2006, and as Associate Editor for

    IEEE Transactions on Image Processing from 1996 to 2000.

    James Millar has B.S. degree in electricalengineering and computer science from theUniversity of Western Ontario. He receivedhis M.S. degree from the University of Man-itoba. He previously worked with NortelNetworks providing carrier network designand planning services, and was an Instruc-tor of electronics at St. Lawrence Collegein Kingston, Canada. Currently he is work-ing with CMC Microsystems in Kingston,Canada, as a System-On-Chip Design Engineer with an interest insystem-level design for microsystems integration.

    http://www.chrisbrunner.org/http://www.chrisbrunner.org/http://www.chrisbrunner.org/