
Computer Networks 55 (2011) 3634–3647

Contents lists available at ScienceDirect

Computer Networks

journal homepage: www.elsevier.com/locate/comnet

A real-time MIMO-OFDM mobile WiMAX receiver: Architecture, design and FPGA implementation ☆

O. Font-Bach a,⇑, N. Bartzoudis a, A. Pascual-Iserte a,b, D. López Bueno a

a Centre Tecnològic de Telecomunicacions de Catalunya (CTTC), Parc Mediterrani de la Tecnologia (PMT), Av. Carl Friedrich Gauss 7, 08860 Castelldefels, Barcelona, Spain

b Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Campus Nord, Jordi Girona 1-3, 08034 Barcelona, Spain

Article info

Article history: Available online 30 March 2011

Keywords: MIMO; Testbeds; IEEE 802.16e; Real-time systems; FPGAs; DSP

1389-1286/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.comnet.2011.02.018

☆ This work was partially supported by the European Commission under projects NEWCOM++ (216715) and BuNGee (248267); by the Catalan Government under Grants 2009 SGR 891 and 2010 VALOR 198; and by the Spanish Government under project TEC2008-06327-C03 (MULTI-ADAPTIVE) and Torres Quevedo Grants PTQ-08-01-06441, PTQ06-02-0540, PTQ06-2-0553.

⇑ Corresponding author.

E-mail addresses: [email protected] (O. Font-Bach), [email protected] (N. Bartzoudis), [email protected] (A. Pascual-Iserte), dlopez@cttc.cat (D.L. Bueno).

Abstract

The IEEE 802.16e-2005 standard, also denoted as mobile WiMAX, was introduced as one of the first real efforts towards the deployment of fourth generation communication systems providing fixed and mobile broadband wireless access. Mobile WiMAX supports multiple input multiple output (MIMO) antenna techniques, which are considered a key technology in wireless communication systems for increasing both data rates and system performance. This paper presents a real-time 2 × 2 MIMO mobile WiMAX receiver with a detailed description of the architecture, design and implementation steps. The complexity of the real-time baseband signal processing has been scaled up due to the high channel bandwidth that was adopted. A range of equipment and instrumentation comprising our high-performance experimental MIMO testbed was used to validate the operation of the mobile WiMAX receiver. The paper includes a subset of results that demonstrate the system performance using standard 2 × 2 MIMO mobile channels.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Multiple input multiple output (MIMO) technology, which uses multiple antennas at both the transmitter and receiver sides, is widely proposed as one of the key techniques to enhance the link quality and/or improve the spectrum efficiency of cellular systems. However, increasing the performance of multi-antenna mobile terminal devices implies the use of processing-intensive algorithms at baseband. The selected signal processing solutions should satisfy a trade-off between performance, numerical stability and hardware efficiency. This results in tremendous challenges for real-time hardware implementations at baseband. Field programmable gate arrays (FPGAs) provide the necessary technology to deploy bit-intensive systems, as long as the proposed algorithms are realistic for real-time implementation.

The ability to verify the benefits of new MIMO techniques or the performance of new communication standards based on orthogonal frequency-division multiplexing (OFDM)-MIMO schemes is an emerging goal of both academic and industrial research initiatives. Testbeds allow the validation of such developments in realistic environments, accounting for hardware limitations and software or coding constraints. In this paper we present GEDOMIS® (GEneric hardware DemOnstrator for MIMO Systems), a multi-antenna wireless communication testbed that enables the prototyping and evaluation of MIMO physical layer (PHY) algorithms. GEDOMIS® is currently hosting a real-time implementation of the IEEE 802.16e standard (i.e., mobile WiMAX) in a 2 × 2 MIMO configuration, featuring a bandwidth of 20 MHz and using matrix-A encoding based on Alamouti's space–time block code (STBC) [1] on a per-carrier basis. The physical layer algorithms of the receiver were modeled in Matlab, designed in VHDL and implemented on a real-time FPGA platform.

The aim of our work is twofold: a principal objective is to present the baseband algorithms that are necessary to efficiently design the architecture of a real-time MIMO mobile WiMAX receiver; another important goal is the analysis of the design, implementation and debugging issues that have to be considered when embarking on the challenging task of building a real-life wireless broadband communication system.

2. Review of the state-of-the-art

The first wave of MIMO-OFDM technology is found in testbeds featuring the IEEE 802.11 family of standards, commonly referred to as wireless local area networks (WLAN) [2,3]. The 3rd Generation Partnership Project (3GPP) Long Term Evolution (LTE) has also embraced MIMO-OFDM technology. A real-time 2 × 2 pre-LTE MIMO software-radio testbed implemented using FPGA and DSP technology is described in [4]. In [5] a real-time 12 × 12 MIMO-OFDM LTE testbed (with 20 MHz bandwidth) is implemented using cell processors; however, its description does not reveal the immaturity of the underlying microprocessor technology in terms of available pre-verified functions, design software and coding peculiarities. Another LTE-based real-time 2 × 2 MIMO-OFDM system with 20 MHz bandwidth is presented in [6]. This work can be considered a prime reference in the field of experimental research using MIMO testbeds, demonstrating the functionality of a base station (BS) and several mobile subscribers in an urban environment using a wideband channel emulator.

There are various other MIMO testbeds in the literature that make certain assumptions or simplifications to scale down the processing complexity at baseband. For instance, [7] presents an FPGA implementation of a low-bandwidth, real-time 2 × 2 MIMO system using a software-generated indoor channel. Another indicative example is presented in [8], involving a low-bandwidth 2 × 2 MIMO testbed implementing offline baseband processing.

A well-established wireless open-access research platform, namely WARP [9], has been adopted by various researchers. An indicative example is presented in [10], where the WARP testbed realizes a real-time OFDM-based cooperative system, using a distributed version of Alamouti's block code, to analyze its capacity versus a 2 × 1 multiple input single output (MISO) system. Another initiative is the Vienna MIMO testbed [11], which appears in numerous recent publications such as [12], where the PHY throughput of a 2 × 2 MIMO fixed WiMAX system featuring Alamouti coding is measured in an urban and an alpine scenario.

Several other fixed WiMAX implementations were encountered focusing on the characterization of certain channels with in-the-field test campaigns. A 2 × 2 MIMO testbed featuring offline baseband processing is presented in [13]. The proposed system is used to evaluate the performance of differential STBC (DSTBC) transmissions over indoor channels. In [14] a real-time, low-bandwidth 2 × 2 MIMO testbed, built with FPGAs, is used to carry out channel capacity measurements. In [15] path loss measurements are conducted in a rural environment using a WiMAX-based network.

Although mobile WiMAX is currently considered a mature technology, sources in the literature are still scarce, mainly presenting single-antenna implementations [16] with offline baseband processing [17]. Real-time mobile WiMAX testbeds featuring MIMO are even more uncommon in the literature. We have only encountered partial FPGA-based implementations presenting ambiguous performance or results. For instance, in [18] a real-time 2 × 2 MIMO fixed sphere decoder (FSD) is implemented in an FPGA, while the rest of the system is developed in Matlab. In [19] a 2 × 2 MIMO system with 10 MHz bandwidth, using a single cell processor, presents the IEEE 802.16e orthogonal frequency-division multiple access (OFDMA) PHY for a BS transceiver. Nevertheless, we could not properly evaluate this work due to the absence of mobility in the channels under test. Finally, in [20] the acclaimed FPGA-based transceiver implementation of a 2 × 2 MIMO system presents limited simulation and emulation results of a partially described system architecture.

2.1. Contribution

The contribution of our real-time broadband MIMO-OFDM implementation compared to other mobile WiMAX physical layer implementations is explained in the following lines (comparison with the WLAN implementations is omitted, since their scope and operating requirements are different). Although our implementation lacks the full scalability defined by the IEEE 802.16e-2005 standard, it is not merely an isolated hardware acceleration of specific PHY-layer algorithms (as are the mobile WiMAX implementations cited previously); our solution is a prototype of the complete PHY-layer processing chain of a MIMO-OFDM mobile WiMAX receiver. On top of this, our system was designed with the required modularity, which enables and facilitates the aggregation of scalable OFDM features (the limitations of our proposed solution with respect to the implementation overheads of the mobile WiMAX OFDM scalability are discussed in the following sections).

Furthermore, the system described herein is a real-time FPGA implementation capable of carrying out challenging baseband signal processing on-the-fly, while offering the required design modularity and structure to host a variety of advanced MIMO schemes (e.g., an open-loop MIMO configuration was implemented and is presented herein). This can be considered an important operating difference when compared to the previously presented non-real-time implementations, which are easy, fast and cheap to deploy but also have significant downsides (e.g., processing of long data frames becomes unaffordable, and closed-loop schemes cannot be directly evaluated).

A key difference between our work and that of other authors is that their implementations apply certain simplifications to facilitate rapid prototyping; this widens the gap between their work and a close-to-real-world implementation like ours. These simplifications mainly consist of neglecting or underestimating the implementation complexity overhead, the precise signal model (including its impairments), the effects of using fixed-point logic, and the need to include intelligent processing blocks in the control plane and memory management interfaces (a paramount requirement in real-time systems), and of using channel models with reduced complexity (or even no realistic channel model at all).

It is also worth mentioning that both the current WiMAX IPs targeting mobile terminals and the encountered experimental MIMO mobile WiMAX testbeds operate at 10 MHz. The system described in this paper uses a 20 MHz channel bandwidth, a state-of-the-art feature that exceeds the WiMAX Forum Radio Conformance Tests. It should be underlined that this bandwidth forms part of the mandatory implementation specifications that will be introduced in the IEEE 802.16m version of the standard (i.e., in 2012).

1 In addition to PUSC, the AMC permutation scheme has been implemented and validated only for the SISO mobile WiMAX receiver. Thus, the results are not included in this paper.

3. Mobile WiMAX: outline of the physical layer specifications

The IEEE 802.16e-2005 standard, on top of various enhancements of its previous version regarding stationary operations, supports mobile subscriber stations at vehicular speeds and thus specifies a system for combined fixed and mobile broadband wireless access. The PHY layer of mobile WiMAX supports scalable OFDMA architectures. The scalability is achieved by modifying the fast Fourier transform (FFT) size, a feature that facilitates the support of various channel bandwidths (i.e., from 1.25 to 20 MHz). Mobile WiMAX also supports adaptive modulation and coding (AMC), various subchannelization permutation techniques and MIMO-aided transmit/receive diversity. OFDM is used for both downlink (DL) and uplink (UL) transmissions.

In order to create the OFDM symbol in the frequency domain, the modulated symbols are mapped onto the subchannels that have been allocated for the transmission of the data block. A subchannel, as defined by IEEE 802.16e-2005, is a logical collection of subcarriers. The number and distribution of the subcarriers that comprise a subchannel depend on the permutation mode. The number of subchannels allocated for transmitting a data block depends on various parameters, such as the size of the data block, the modulation format, and the coding rate. There are numerous variations of permutation schemes defined in the standard. The OFDMA frame may include multiple zones hosting different permutation schemes, with partial usage of subchannels (PUSC) being the one utilized in our system. The 802.16e standard makes use of different MIMO techniques, such as STBC, beamforming and spatial multiplexing (SM).

Our system, as indicated in Fig. 1b, adopts only a fixed subset of this flexible configuration of the PHY layer. The configuration parameters of the OFDM frame used in our testbed (DL) define a single burst with a fixed, predefined format (i.e., FCH and DL-MAP are not decoded), as seen in Fig. 1a. The selected encoding scheme is Alamouti's STBC (defined as matrix A in the WiMAX standard). The scalability of the OFDM-based mobile WiMAX standard adds further complexity in the control plane (i.e., different permutation schemes¹, variable length of the cyclic prefix (CP), different modulation schemes, etc.). However, it is important to underline that the bulk of the data plane processing challenges have been met by using the high bandwidth of 20 MHz. The latter dramatically increased the design and implementation considerations at baseband, since it implied additional processing complexity and memory requirements.

The first processing block of a WiMAX transmitter as defined by the standard is a randomizer that pseudo-randomly scrambles the input data. The modulator maps data into constellation points and the subcarrier mapper allocates the symbols to the corresponding subcarriers. A preamble generator produces a symbol that precedes the burst, facilitating the timing synchronization. The pilot subcarrier mapper inserts pilot subcarriers into each data burst. The inverse FFT transforms the frequency-domain signal into a time-domain signal and a CP is inserted to obtain the complete OFDM signal. The standard does not define the exact structure of the receiver, which in all respects follows the reverse of the signal processing sequence described for the transmitting stage.
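The last two transmitter steps described above (inverse transform and CP insertion) can be illustrated with a minimal Python sketch. It uses toy sizes (a 16-point transform and a 4-sample CP) and a naive inverse DFT in place of an IFFT; the function names and the QPSK test pattern are assumptions made for illustration and do not come from the paper.

```python
import cmath

def idft(freq_bins):
    """Naive inverse DFT, O(N^2); a practical design would use an IFFT."""
    n = len(freq_bins)
    return [
        sum(freq_bins[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def add_cyclic_prefix(time_samples, cp_len):
    """Prepend a copy of the last cp_len samples (cp_len must be > 0)."""
    return time_samples[-cp_len:] + time_samples

# Toy sizes for illustration only.
N_FFT = 16
CP_LEN = 4

# Hypothetical frequency-domain content: a repeating QPSK pattern.
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
freq = [qpsk[k % 4] for k in range(N_FFT)]

symbol = add_cyclic_prefix(idft(freq), CP_LEN)
```

With the figures quoted later in the paper (a 2048-point FFT and a 512-sample CP), the same construction yields 2560 samples per OFDM symbol.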

3.1. MIMO technology

One of the most common problems faced by designers of wireless communication systems is the phenomenon of fading that arises due to the spatio-temporal variations of the wireless channel. This is inevitable in wave-reflecting and scattering environments that are subject to changes over time. The multiple received versions caused by reflections are referred to as multipath and can eventually produce a deep fade in the signal.

One of the main proposed techniques to tackle this effect is found in MIMO systems, which comprise multiple antennas at the transmitter and at the receiver sides. MIMO systems use diversity techniques to mitigate the effects of fading by providing multiple copies of the same signal. The use of multiple antennas dramatically reduces the probability of simultaneous deep fades in all the receive antennas. MIMO technology may also be exploited to implement SM, which significantly increases the spectral efficiency, and hence the capacity, of a wireless communication system. SM realizes high data rates by transmitting independent information streams in parallel over different transmit antennas. Hence, MIMO technology features a trade-off between the quality of service provided by diversity schemes and the high data rates provided by SM [21].

Fig. 1. The OFDM frame-definition and main specifications of our system.

There are different ways and transmission strategies to capitalize on the benefits of diversity, which primarily depend on the degree of knowledge of the channel response, i.e., the channel state information (CSI). In order to obtain such CSI at the transmitter when channel reciprocity does not apply, a feedback channel from the receiver to the transmitter can be implemented.

An indicative MIMO transmission technique that does not require channel knowledge at the transmitter is space–time coding (STC), which utilizes both block [22] (STBC) and trellis [23] (STTC) codes. Alamouti's block code, which benefits from its inherent orthogonality and uses 2 antennas at the transmit side, has become increasingly popular among other codes because of its optimal and low-complexity decoding stage at the receiver.
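To make the low-complexity decoding of Alamouti's code concrete, the following Python sketch encodes two symbols over two transmit antennas and two time slots and then recovers them with linear combining. It is a textbook illustration under strong assumptions (flat channel, constant over both slots, perfectly known at the receiver, no noise); the function names and channel values are hypothetical and this is not the receiver implemented in the paper.

```python
def alamouti_encode(s1, s2):
    """Return the (antenna 0, antenna 1) symbols for slot 0 and slot 1."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]

def channel(slots, h):
    """Noise-free flat channel; h[i][j] is the gain from tx antenna j to rx antenna i.

    Returns r[i][t], the sample at rx antenna i in slot t.
    """
    return [[h[i][0] * tx0 + h[i][1] * tx1 for (tx0, tx1) in slots]
            for i in range(len(h))]

def alamouti_combine(r, h):
    """Linear combining; the orthogonality of the code removes cross terms."""
    n_rx = len(h)
    gain = sum(abs(h[i][j]) ** 2 for i in range(n_rx) for j in range(2))
    s1 = sum(h[i][0].conjugate() * r[i][0] + h[i][1] * r[i][1].conjugate()
             for i in range(n_rx))
    s2 = sum(h[i][1].conjugate() * r[i][0] - h[i][0] * r[i][1].conjugate()
             for i in range(n_rx))
    return s1 / gain, s2 / gain

# Hypothetical 2 x 2 channel and QPSK symbols, for illustration only.
h = [[0.8 + 0.3j, -0.2 + 0.5j],
     [0.1 - 0.9j, 0.6 + 0.4j]]
s1, s2 = 1 + 1j, -1 + 1j

r = channel(alamouti_encode(s1, s2), h)
s1_hat, s2_hat = alamouti_combine(r, h)
```

Each estimate is scaled by the sum of the squared channel magnitudes over all four paths, which is where the diversity gain comes from: the estimate only collapses if all paths fade at once.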

4. Receiver architecture and design

4.1. Received signal model

The WiMAX signal comprises frames which encapsulate user data and silence periods which are inserted between these frames. During the silence periods the receiver continuously monitors the incoming signal in order to detect the beginning of the following frame using a synchronization algorithm. In a real-world MIMO-OFDM testbed, the received radio frequency (RF) signal, on top of the noise, is impaired due to the performance characteristics of the equipment used (e.g., channel emulator, RF front-end, baseband signal processing boards, etc.). If the received subcarriers lose their orthogonality due to analog and RF impairments, the performance of the MIMO-OFDM system degrades dramatically. Thus, such signal impairments have to be determined and removed before making the symbol decisions.

The specifications and performance of the equipment composing the GEDOMIS® testbed (as detailed in Section 5) allow our signal model to safely ignore certain negligible signal impairments such as: the in-phase and quadrature (I/Q) gain and phase imbalances, the inaccuracy of the sampling clocks of the transmitter and receiver with respect to the ideal sampling frequency, the local oscillator (LO) drifts and, finally, the random phase noise due to LO instability. The resulting received signal model at the output of the RF down-converters at the ith receive antenna can be expressed as:

$$c_i(t) = \Re\left\{ x_i(t)\, e^{j 2\pi (f_{IF} + \Delta f) t} \right\} + A_i + B_i \cos\left(2\pi (f_{IF} + \Delta f) t + \varphi_i\right) + w_i(t), \tag{1}$$

where $x_i(t)$ represents the useful part of the received baseband signal, $f_{IF}$ is the intermediate frequency (IF), $\Delta f$ is the carrier frequency offset (CFO), $A_i$ is the direct current (DC) level introduced by the baseband board chassis, $B_i \cos(2\pi (f_{IF} + \Delta f) t + \varphi_i)$ represents the unwanted residual carrier, located at the center of the useful signal spectrum (i.e., introduced by the LO coupling at the transmitter) and, finally, $w_i(t)$ is the Gaussian noise. The useful part of the received baseband signal at the ith receive antenna can be expressed as follows:

$$x_i(t) = \sum_{j=1}^{n_T} \tilde{x}_j(t) \ast H_{i,j}(t), \tag{2}$$

where $\ast$ denotes convolution, $\tilde{x}_j(t)$ is the equivalent baseband signal transmitted from the jth transmit antenna, with $n_T$ being the number of assumed transmit antennas, and $H_{i,j}(t)$ is the equivalent baseband impulse response of the MIMO channel between the jth transmit antenna and the ith receive antenna.

4.2. Design of the processing blocks at baseband

An indicative representation of the receiver processing blocks at baseband is shown in Fig. 2. The algorithmic foundation and functionality of each block constituting the mobile WiMAX physical layer will be explained in the following subsections. The entire system, including the transmitter, channel and receiver, was initially modeled in Matlab. Due to the bit-intensive nature of the physical layer algorithms utilizing MIMO technology at the receiver and the real-time system constraints, a suitable processing platform with FPGA devices was selected to implement the receiver. The selected Matlab-based algorithms were implemented by writing custom VHDL code following a register transfer level (RTL) design approach. The Matlab model and the VHDL implementation were co-simulated to verify the system precision and overall performance.

Fig. 2. Signal acquisition and baseband processing architecture.

4.2.1. RF front-end and analog to digital conversion (ADC)

The tasks performed at the receiver's RF front-end are the low-noise amplification, the downconversion from RF to IF (i.e., in our case centered at $f_{IF}$ = 156.8 MHz) and the suppression of out-of-band unwanted signals such as noise and spurs.

Each active component in the receiver chain has a limited dynamic range. Thus, signals exceeding this range are subject to saturation or clipping. Since saturation is detrimental to the system performance, countermeasures should be taken to prevent it. Moreover, the active RF or baseband processing components are subject to thermal or other types of noise. The dynamic range in the baseband part of the receiver may potentially be affected by the presence of a DC offset. Static DC offsets occur due to bias mismatches in the baseband boards, but they can also be generated by LO coupling at the RF transmitter or self-mixing of the LO signals at the RF receiver.

Sampling an analog signal at IF results in replicas of the signal's spectrum which are repeated at uniform intervals. The choice of the sampling rate of such signals is dependent on the signal's bandwidth and the IF center frequency. The chosen bandpass sampling architecture requires only one analog to digital converter (ADC) for the final IF to baseband conversion to occur in the digital domain.

The analog to digital conversion is performed using under-sampling, with an ADC sampling rate of $f_s$ = 89.6 MHz. Therefore, after the ADC, one of the aliases of the discrete signal will be located at 22.4 MHz, which is the baseband sampling frequency of the receiver as described by the WiMAX standard. The digital spectrum after the sampling is depicted in Fig. 3. The delta at baseband represents the DC coupling of the baseband hardware, which has to be taken into account in the design of the baseband signal processing stages. The delta at the center of the signal spectrum represents the coupling of the analog LOs at both the up- and down-converters.
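The location of the alias can be checked with a short calculation. The Python sketch below folds a real tone into the first Nyquist zone; the helper name is an assumption made for illustration, while the two frequencies are the ones quoted in the text.

```python
def alias_frequency(f_in_hz, fs_hz):
    """Frequency in [0, fs/2] at which a real tone at f_in appears after sampling at fs."""
    f = f_in_hz % fs_hz
    return min(f, fs_hz - f)

F_IF_HZ = 156_800_000  # IF center frequency from the text
FS_HZ = 89_600_000     # ADC sampling rate from the text

alias_hz = alias_frequency(F_IF_HZ, FS_HZ)
```

Here 156.8 MHz modulo 89.6 MHz is 67.2 MHz, which folds to 89.6 - 67.2 = 22.4 MHz, in agreement with the alias position stated above.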

4.2.2. Automatic gain control

The automatic gain control (AGC) is an analog–digital hybrid processing block providing an interface between the FPGAs and the RF front-end. The programmable gain amplifier (PGA) is a digitally-controlled analog circuit with a discrete set of possible gain values, while the algorithm that decides the new gain value of the PGA is implemented in the digital domain.

The correct operation of the AGC is a decisive factor for the overall performance of a mobile receiver. The AGC adjusts in a timely manner the power level of the input IF signal to utilize the full dynamic range of the ADC and overcome the variations caused by the mobile channel fading. Frame-based OFDM systems are especially prone to a high peak-to-average power ratio (PAPR); the inclusion of a back-off margin that prevents signal clipping is therefore a prerequisite. The ADC device indicates its saturation with a state signal. When saturation occurs, the AGC does not forward data until the signal has been attenuated to within the optimal dynamic range.

The heart of the AGC algorithm is a signal peak detector operating on a per-frame basis (i.e., fixed gain for an entire frame), which provides a baseline trade-off between implementation complexity and efficiency. In a frame-based communication system like mobile WiMAX, where the channel varies rapidly in high mobility conditions, the AGC algorithm has a very limited timing budget in which to operate. This is because the AGC must calculate the gain of the next frame and apply it to the PGA registers during the inter-frame silence period (i.e., taking into account the peak value of the previous data frame).

Fig. 3. Spectrum of the digital signal after sampling at 89.6 MHz.

The PGA used in our receiver has 16 gain steps with 1.5 dB of separation (i.e., resolution of the gain corrections). Thus, starting from the gain value applied during the previous frame, the optimal adjustment of the IF input power level during the following frame, $\Delta G$, is calculated as follows:

$$\Delta G = 10 \log_{10}\left( \frac{v_{FS}^2}{v_{BM}^2 \, v_{PK}^2} \right)\ \text{dB} = 10 \log_{10} g\ \text{dB}, \tag{3}$$

where $v_{FS}$ is the digital full scale of the quantizer in the ADC, $v_{BM}$ accounts for the back-off safety margin, and $v_{PK}^2 = \max |c_i[n]|^2$, with $c_i[n]$ representing the samples at the output of the ADC during the previous frame. In order to minimize the processing complexity and the implementation latency, we have calculated the contents of a look-up table (LUT) that correlates all the possible values of $g$ with the applicable gain corrections (i.e., number of steps, $\Delta G$).
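The mapping from Eq. (3) to a PGA setting can be sketched in Python as follows. The step size and number of steps come from the text; everything else is an assumption made for illustration: the gain index convention (index 0 is the lowest gain), the rounding toward less gain, the clamping, and the numeric values in the example. The paper uses a precomputed LUT rather than evaluating a logarithm at run time.

```python
import math

PGA_STEP_DB = 1.5    # gain resolution quoted in the text
PGA_NUM_STEPS = 16   # number of gain settings quoted in the text

def gain_correction_db(v_fs, v_bm, v_pk):
    """Eq. (3): gain change that moves the measured peak to full scale minus back-off."""
    g = v_fs ** 2 / (v_bm ** 2 * v_pk ** 2)
    return 10.0 * math.log10(g)

def next_gain_index(current_index, delta_g_db):
    """Quantize the correction to whole PGA steps and clamp to the valid range.

    Assumption: floor() is used so that rounding never applies more gain than
    Eq. (3) asks for, which errs on the side of avoiding ADC clipping.
    """
    steps = math.floor(delta_g_db / PGA_STEP_DB)
    return max(0, min(PGA_NUM_STEPS - 1, current_index + steps))

# Hypothetical numbers: full scale 2048, a back-off factor of 2 (about 6 dB),
# and a peak of 256 measured over the previous frame.
delta_g = gain_correction_db(v_fs=2048, v_bm=2, v_pk=256)
new_index = next_gain_index(current_index=3, delta_g_db=delta_g)
```

In this example $g$ = 16, so the correction is about 12.04 dB, which corresponds to 8 whole steps of 1.5 dB. A LUT indexed by a quantized $g$ gives the same step count without any arithmetic in the inter-frame silence period.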

4.2.3. Digital down converter

The digital down converter (DDC) implements three functions: channel frequency translation, I/Q components extraction and signal decimation. The direct digital synthesizer (DDS) component of the DDC translates any frequency band within the analog bandwidth of the ADCs down to zero frequency (i.e., baseband), while a complex finite impulse response (FIR) low-pass filter is responsible for eliminating out-of-band components. Finally, an output decimator and formatter, which keeps one out of every four samples, delivers the complex representation of the digitized signal whose spectrum is shown in Fig. 3.

The digital filtering stage of the DDC is important because it prevents aliasing during the sub-sampling process. Hence, it is critical to account for the system-wide signal impairments when designing the digital filter. For this reason, the passband and reject frequencies should be carefully selected, keeping the useful signal spectrum intact while at the same time eliminating the effects of the DC level, which is an inherent feature of the baseband processing boards (e.g., the DC is transformed into a synchronization-altering sinusoid which is eventually filtered out). The designed low-pass filter has 103 coefficients and is implemented jointly with the decimation stage as a polyphase decimator filter. Fig. 4 shows the frequency domain representation of the operations performed within the DDC.
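The idea behind a polyphase decimator can be sketched in a few lines of Python. The example below compares a direct FIR evaluated at every fourth sample with a polyphase arrangement in which the filter is split into four sub-filters that all run at the output rate. It uses a toy 9-tap real filter and integer data, not the 103-coefficient complex filter of the paper, and the function names are assumptions made for illustration.

```python
def fir_at_decimated_instants(x, h, m):
    """Reference: FIR output y[n] = sum_k h[k] x[n-k], evaluated only at n = 0, m, 2m, ..."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(0, len(x), m)
    ]

def polyphase_decimate(x, h, m):
    """Same outputs, organized as m sub-filters e_p[k] = h[k*m + p] at the low rate."""
    phases = [h[p::m] for p in range(m)]
    y = []
    for out in range((len(x) + m - 1) // m):
        acc = 0
        for p, e in enumerate(phases):
            for k, coeff in enumerate(e):
                idx = (out - k) * m - p  # index into the full-rate input
                if idx >= 0:
                    acc += coeff * x[idx]
        y.append(acc)
    return y

# Toy data: integer input ramp and a symmetric 9-tap filter.
x = list(range(1, 41))
h = [1, 2, 3, 4, 5, 4, 3, 2, 1]
M = 4  # decimation factor used in the text (one sample kept out of four)

y_ref = fir_at_decimated_instants(x, h, M)
y_poly = polyphase_decimate(x, h, M)
```

Both functions produce identical outputs; the polyphase form is attractive in hardware because each sub-filter processes samples at one quarter of the input rate.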

The output frequency of the DDS, $f_{DDS}$, is controlled by the phase increment, $\Delta\theta$, which is related to $f_{DDS}$ by:

$$f_{DDS} = \frac{f_s \cdot \Delta\theta}{2^{B_{\theta(n)}}}\ \mathrm{Hz}, \tag{4}$$

where $f_s$ is the ADC sampling rate, as described in Section 4.2.1, and $B_{\theta(n)}$ is the resolution in bits of the internal accumulator used in the DDS (32 bits in our case). On power-up, $f_{DDS}$ is tuned to 22.4 MHz and is then constantly updated in real time to compensate for the effects of the CFO:

$$f_{DDS} = 22.4 + \Delta f\ \mathrm{MHz}, \tag{5}$$

where $\Delta f$ represents the CFO, which is defined in terms of the separation between adjacent subcarriers:

$$\Delta f = \alpha \cdot \frac{22.4 \cdot 10^{6}}{2048}, \tag{6}$$

where $\alpha$ is the CFO normalized with respect to the intercarrier separation (i.e., in practice the CFO will not be higher than one half of the intercarrier separation, $\alpha \in [-0.5, 0.5]$), 22.4 MHz is the sampling frequency and 2048 is the FFT size. Combining (4)–(6), the phase increment is given by:

$$\Delta\theta = \left(22.4 + \alpha \cdot \frac{22.4}{2048}\right) \cdot \frac{2^{32}}{89.6} = 2^{30} + \alpha \cdot 2^{19}. \tag{7}$$
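The relation between (4) and (7) can be checked numerically with a short Python sketch. The 89.6 MHz ADC rate is inferred from the factor in (7); the function names and the rounding of the increment to the nearest integer are assumptions for illustration.

```python
F_S_HZ = 89.6e6   # ADC sampling rate implied by the 89.6 factor in Eq. (7)
ACC_BITS = 32     # DDS accumulator width stated in the text


def phase_increment(alpha):
    """Eq. (7): DDS phase increment for a normalised CFO alpha in [-0.5, 0.5]."""
    return round(2 ** 30 + alpha * 2 ** 19)


def dds_frequency_hz(delta_theta):
    """Eq. (4): DDS output frequency produced by a given phase increment."""
    return F_S_HZ * delta_theta / 2 ** ACC_BITS
```

With $\alpha = 0$ the increment is $2^{30}$, which (4) maps back to 22.4 MHz. With $\alpha = 0.5$ it is $2^{30} + 2^{18}$, i.e., 22.4 MHz plus half a subcarrier spacing (5468.75 Hz). One least significant bit of the increment corresponds to $89.6 \cdot 10^{6} / 2^{32} \approx 0.021$ Hz.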

4.2.4. Synchronization, CFO estimation and correction

The WiMAX frame comprises several OFDM symbols; in our system configuration, 46 symbols are used for the data, each one having 2560 samples. Symbol detection is required to properly locate the FFT window of the samples corresponding to each OFDM symbol. This is feasible thanks to the inclusion of a CP at the beginning of each OFDM symbol. Considering channels with a maximum delay spread of 2510 ns (about 56.2 samples at 22.4 Msps), only 455 out of the 512 samples in the CP can be used for the timing synchronization (i.e., the remaining 57 samples are discarded to avoid unreliable operation of the FFT window-locator). The implemented synchronization technique is based on a sliding window of 2048 + 455 samples, which allows us to calculate the cross-correlation of two groups of 455 samples (having a separation of 2048 samples). The expression corresponding to the square of the correlation when the sliding window starts at the nth sample is given by:

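$$|r_s[n]|^{2} = \frac{\left|\sum_{i=1}^{n_R}\sum_{l=0}^{454} s_i^{*}[n+l] \cdot s_i[n+l+2048]\right|^{2}}{\left(\sum_{i=1}^{n_R}\sum_{l=0}^{454} |s_i[n+l]|^{2}\right) \cdot \left(\sum_{i=1}^{n_R}\sum_{l=0}^{454} |s_i[n+l+2048]|^{2}\right)}, \tag{8}$$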


Fig. 4. Frequency representation of the operations performed in the DDC.

3640 O. Font-Bach et al. / Computer Networks 55 (2011) 3634–3647

where $s_i[n]$ is the equivalent complex baseband signal at the output of the DDC, sampled at 22.4 MHz, at the ith receive antenna processing branch, and $n_R$ denotes the number of receive antennas (i.e., $n_R = 2$ in our case). Because a direct implementation of (8) is extremely resource-demanding, the following simplification was applied:

$$|r_s[n]|^{2} = \frac{|d_n[n]|^{2}}{d_{s0}[n] \cdot d_{s1}[n]}, \tag{9}$$

where:

$$d_n[n+1] = \begin{cases} d_n[n] + \sum_{i=1}^{n_R} s_i^{*}[n+455] \cdot s_i[n+2048+455], & \text{if } n \le 455, \\[4pt] d_n[n] - \sum_{i=1}^{n_R} s_i^{*}[n] \cdot s_i[n+2048] + \sum_{i=1}^{n_R} s_i^{*}[n+455] \cdot s_i[n+2048+455], & \text{if } n > 455, \end{cases} \tag{10}$$

with $d_n[0] = 0$. It should be noted that $d_{s0}[n]$ and $d_{s1}[n]$ are calculated in a similar manner. With this optimization, only four samples need to be introduced into the already calculated correlation. A peak in $|r_s[n]|^{2}$ indicates the detection of the symbol and thus the sample where the CP starts, i.e., $pos_{cp} = \arg\max_n |r_s[n]|^{2}$. Additionally, the phase of the correlation (i.e., the numerator of $|r_s[n]|^{2}$) at $pos_{cp}$ can be used to estimate the phase shift of the received signal in the presence of CFO. Using the notation given in (6), the phase shift between two signal samples delayed by 2048 positions is equal to $e^{j2\pi\Delta f t}\big|_{t = 2048 \cdot \frac{1}{22.4 \cdot 10^{6}}} = e^{j2\pi\alpha}$. Therefore, the estimated CFO or, equivalently, the constant $\alpha$ can be defined as:

$$\alpha = \frac{1}{2\pi}\angle\left(r_s[pos_{cp}]\right) = \frac{1}{2\pi}\angle\left(d_n[pos_{cp}]\right), \tag{11}$$

where $\alpha$ can be calculated using a coordinate rotation digital computer (CORDIC) algorithm. As already mentioned, the estimated CFO is used to fine-tune the DDS.

The architecture of the proposed synchronization scheme is shown in Fig. 5. Due to the stringent real-time constraints of our system, a pipelined structure has been deployed, requiring a latency of more than one clock cycle. First in first out (FIFO) memories provide latency-levelling temporary storage for the incoming signal. A custom design for such memories allows the retrieval of the four specific samples employed to calculate the cross-correlation at each clock cycle. Additionally, once the location of the first sample of each OFDM symbol is determined by the data-forwarding control logic, the window of subcarriers composing this symbol is synchronously forwarded from the FIFOs.

Fig. 5. Architecture of the synchronization block.
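The running-sum correlation of (9) and (10) and the CFO estimate of (11) can be sketched in software as follows. This is a floating-point model under stated assumptions, not the pipelined FPGA design: the window sums at $n = 0$ are formed by direct summation rather than by the fill phase in (10), `cmath.phase` stands in for the CORDIC, and the helper `make_test_signal` is a hypothetical generator of a single CP-prefixed symbol with a known CFO.

```python
import cmath
import math
import random


def cp_correlation(branches, n_fft=2048, win=455):
    """Return the metric of Eq. (9) and the complex correlation d_n per window start."""
    count = len(branches[0]) - n_fft - win + 1
    # Window sums at n = 0, formed directly (assumption; Eq. (10) fills them from zero).
    d_n = sum(s[l].conjugate() * s[l + n_fft] for s in branches for l in range(win))
    d_s0 = sum(abs(s[l]) ** 2 for s in branches for l in range(win))
    d_s1 = sum(abs(s[l + n_fft]) ** 2 for s in branches for l in range(win))
    metric, corr = [], []
    for n in range(count):
        energy = d_s0 * d_s1
        metric.append(abs(d_n) ** 2 / energy if energy > 0 else 0.0)
        corr.append(d_n)
        if n + 1 == count:
            break
        for s in branches:
            # Slide by one sample: four samples per branch enter the update.
            d_n += s[n + win].conjugate() * s[n + win + n_fft] - s[n].conjugate() * s[n + n_fft]
            d_s0 += abs(s[n + win]) ** 2 - abs(s[n]) ** 2
            d_s1 += abs(s[n + win + n_fft]) ** 2 - abs(s[n + n_fft]) ** 2
    return metric, corr


def detect_and_estimate(branches, n_fft=2048, win=455):
    """pos_cp = argmax of the metric; alpha from the phase of d_n there, Eq. (11)."""
    metric, corr = cp_correlation(branches, n_fft, win)
    pos_cp = max(range(len(metric)), key=metric.__getitem__)
    alpha = cmath.phase(corr[pos_cp]) / (2 * math.pi)
    return pos_cp, alpha


def make_test_signal(rng, n_fft, cp, start, total, alpha, sigma=0.01):
    """Hypothetical test input: one CP-prefixed random symbol in weak noise, with CFO alpha."""
    useful = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_fft)]
    symbol = useful[-cp:] + useful
    sig = [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(total)]
    for k, x in enumerate(symbol):
        sig[start + k] += x
    return [x * cmath.exp(2j * math.pi * alpha * n / n_fft) for n, x in enumerate(sig)]
```

Because the correlation window is shorter than the CP, the metric stays close to its maximum for every window start between the first CP sample and `cp - win` samples later, so the detected position can fall anywhere in that range. The sketch can be exercised with scaled-down sizes, e.g., `detect_and_estimate(branches, n_fft=128, win=24)` on two branches built with `make_test_signal(rng, 128, 32, 50, 400, 0.2)`.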

The detection of the correlation peak is a critical part of the synchronization algorithm because it is error-prone in the presence of spurious signals. In our system, we experience a parasitic sinusoid because of the presence of CFO drifts and due to the DC level created by the digital mixing of the unwanted residual carrier with a digitally generated sinusoid at the DDS stage. The presence of this sinusoid during the silence period can result in erroneous performance of the aforementioned correlator, which may indicate the presence of peaks and consequently misplace the window of samples forwarded to the FFT processing block. Failure to prevent an erroneous symbol detection may render the system unusable. Thus, the peak detection algorithm, which is usually based on a triggering threshold and the selection of the maximum value in a window, must be optimized to recognize the legitimate peaks. When the trigger reports a correlation value above the threshold, the shape of the correlation curve determines whether the located peak indicates the beginning of an OFDM symbol or a silence period. The correlation curve tends to have high values and nearly no variations during the silence periods (Fig. 6); on the other hand, during an OFDM symbol it presents high values only while the CP is being processed (Fig. 7).

4.2.5. Pilot extraction and channel estimation

The channel estimation in our receiver is based on the pilot subcarriers that are transmitted in each OFDM symbol. Let $S_i[k]$ be the kth subcarrier in the OFDM symbol received by the ith receive antenna after the FFT (i.e., $S_i[k] = \mathrm{FFT}(s_i[n])$), with $k \in [0 \cdots n_U - 1]$, where $n_U$ represents the number of subcarriers used to transmit user data and pilot tones. The IEEE 802.16e standard defines the value, $p_v = \frac{4}{3}$, and location (i.e., frequency), $p_{k,j}$, of such special subcarriers for each transmit antenna j, i.e., $p_{k,j} \in [0 \cdots n_U - 1]$. The number and distribution of pilot subcarriers depend on the subcarrier permutation scheme. When the PUSC permutation scheme is used, clusters of 14 contiguous subcarriers are defined, with two of them used to transmit pilot tones. Additionally, out of the 2048 subcarriers available in each OFDM symbol, 1440 are used for data transmission and 240 for pilot tone transmission, i.e., $n_U = 1680$ (subcarrier indices 0 to 1679), while the rest are used for the guard bands and the DC carrier. When $n_T = 2$, the PUSC permutation scheme distributes the pilot tones for each antenna over two consecutive OFDM symbols; this imposes the implementation constraint of storing two complete OFDM symbols per receive antenna to enable the pilot-based channel estimation. This also implies that the channel estimation is applied to pairs of consecutive OFDM symbols (i.e., the estimated channel frequency response is the same for both). Fig. 8 shows the detailed cluster structure, which is cyclically repeated every four OFDM symbols, where $O_n \in [0 \cdots 23]$ is the index of the transmitted OFDM symbol pair within the frame and $Tx_i$ represents the ith transmit antenna. Therefore, the cluster structure defines $p_{k,j}$ for each OFDM symbol and each transmit antenna j. Note that when $S_1[k]$ is used as a pilot tone, no transmission occurs for $S_2[k]$ (i.e., null subcarrier) and vice versa, to avoid interference in the pilot positions.

Fig. 6. DC level in the received signal.

Fig. 7. DC level in the received signal with optimized peak location.

Fig. 8. Distribution of the pilot subcarriers in the PUSC permutation.

Each processing branch of the MIMO-enabled receiver has to estimate the corresponding channels from all transmit antennas. First, the channel frequency response at the pilot tones, $\tilde{H}_{i,j}[p_{k,j}]$, is estimated as follows:

$$\tilde{H}_{i,j}[p_{k,j}] = \frac{S_i[p_{k,j}]}{4/3}, \tag{12}$$

where $S_i[p_{k,j}]$ represents the kth pilot tone from the jth transmit antenna after the FFT in the ith receive antenna processing chain, with $j \in [1, 2]$ in our case. Thus, $\tilde{H}_{i,j}[p_{k,j}]$ is a discrete function giving the channel frequency response at the pilot tones between the ith receive antenna and the jth transmit antenna. An interpolation between the pilot positions is then required to estimate the channel at the frequencies where data subcarriers were transmitted for each transmit-receive antenna pair. After comparing different algorithms, we have selected a second-order polynomial interpolation, which provides the best trade-off between accuracy and implementation complexity (accounting for the number of pilot tones in each OFDM symbol and the channel specifications). The channel frequency response for the data subcarriers is calculated as follows:

$$\begin{aligned} \tilde{H}_{i,j}[k] ={} & \tilde{H}_{i,j}[p_{c_1,j}] + \frac{\tilde{H}_{i,j}[p_{c_2,j}] - \tilde{H}_{i,j}[p_{c_1,j}]}{p_{c_2,j} - p_{c_1,j}} \cdot (k - p_{c_1,j}) \\ & + \frac{\dfrac{\tilde{H}_{i,j}[p_{c_3,j}] - \tilde{H}_{i,j}[p_{c_2,j}]}{p_{c_3,j} - p_{c_2,j}} - \dfrac{\tilde{H}_{i,j}[p_{c_2,j}] - \tilde{H}_{i,j}[p_{c_1,j}]}{p_{c_2,j} - p_{c_1,j}}}{p_{c_3,j} - p_{c_1,j}} \cdot (k - p_{c_1,j}) \cdot (k - p_{c_2,j}), \end{aligned} \tag{13}$$

where $p_{c_r,j}$ represents the location of one of the three pilot tones, originating from transmit antenna j, that are closest to $S_i[k]$, for each $r \in [1 \cdots 3]$.

The general processing architecture of the channel estimation is depicted in Fig. 9. The subcarriers are stored in two separate memories, considering that the channel estimation only commences when all the pilot carriers of two OFDM symbols have been received. First, the channel frequency response at the pilot tones is estimated. A carefully designed memory system is required to store the calculated coefficients until the whole $\tilde{H}_{i,j}[p_k]$ is computed. The calculation of the estimated channel frequency response coefficients is implemented using groups of three neighboring pilots (i.e., $\tilde{H}_{i,j}[p_{c_r}]$); their locations within the memory are indicated by their associated indexes (i.e., the FFT output index of the pilot tone), and they are replicated in three memory blocks, avoiding in this way extra latencies and computational complexity in the memory-management plane. Once $\tilde{H}_{i,j}[p_k]$ is available, the interpolation is computed. It must be noted that the incoming subcarriers have lost their sequential order after the FFT calculation; therefore, for each data subcarrier an algorithm is required to calculate the indexes of the three closest pilot subcarriers, $p_{c_r}$, accounting for the position of the actual carrier at the output of the FFT. A FIFO memory component is used to compensate for the latency introduced by the calculations and to provide an aligned output.

Fig. 9. Architecture of the channel estimation block.

Fig. 10. The GEDOMIS® testbed setup.
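The pilot-based estimate of (12) and the second-order interpolation of (13) can be sketched as follows for one transmit-receive antenna pair. This is a minimal floating-point illustration: the function names are hypothetical, the pilot positions used in the example are arbitrary and do not reproduce the PUSC cluster layout, and pilots are assumed to carry the plain value 4/3.

```python
PILOT_VALUE = 4.0 / 3.0  # pilot value p_v stated in the text


def estimate_at_pilots(fft_out, pilot_positions):
    """Eq. (12): divide each received pilot subcarrier by the known pilot value."""
    return {p: fft_out[p] / PILOT_VALUE for p in pilot_positions}


def three_closest(pilot_positions, k):
    """Positions of the three pilots nearest to subcarrier k, in ascending order."""
    return sorted(sorted(pilot_positions, key=lambda p: abs(p - k))[:3])


def interpolate(h_pilots, k):
    """Eq. (13): second-order interpolation through the three closest pilots."""
    p1, p2, p3 = three_closest(list(h_pilots), k)
    h1, h2, h3 = h_pilots[p1], h_pilots[p2], h_pilots[p3]
    slope12 = (h2 - h1) / (p2 - p1)
    slope23 = (h3 - h2) / (p3 - p2)
    curvature = (slope23 - slope12) / (p3 - p1)
    return h1 + slope12 * (k - p1) + curvature * (k - p1) * (k - p2)
```

Since (13) is the polynomial of degree two through three points, a channel whose frequency response is exactly quadratic in k is recovered without error at any data subcarrier, which gives a simple sanity check for an implementation.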

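4.2.6. Matrix A space–time decoding

Alamouti's STC (matrix A) transmission scheme encodes data symbols, $d_k$, in pairs and distributes them in groups of two OFDM symbols. In order to decode and estimate the transmitted data symbols, the following operations are applied:

$$\hat{d}_k[2O_n] = \frac{\sum_{i=1}^{n_R}\left( \tilde{H}_{i,1}^{*}[k] \cdot S_i[k, 2O_n] + \tilde{H}_{i,2}[k] \cdot S_i^{*}[k, 2O_n+1] \right)}{\sum_{i=1}^{n_R}\left( |\tilde{H}_{i,1}[k]|^{2} + |\tilde{H}_{i,2}[k]|^{2} \right)}, \tag{14}$$

$$\hat{d}_{k+1}[2O_n+1] = \frac{\sum_{i=1}^{n_R}\left( \tilde{H}_{i,2}^{*}[k] \cdot S_i[k, 2O_n] - \tilde{H}_{i,1}[k] \cdot S_i^{*}[k, 2O_n+1] \right)}{\sum_{i=1}^{n_R}\left( |\tilde{H}_{i,1}[k]|^{2} + |\tilde{H}_{i,2}[k]|^{2} \right)}, \tag{15}$$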

where $2O_n$ and $2O_n + 1$ represent the indexes of the two consecutive OFDM symbols.² Note that in (14) and (15) it is assumed that the gain applied by the AGC to the incoming sample streams is equal for both receive antennas.

Taking into account the implemented channel estimation for the data subcarriers, it is required to store one OFDM symbol per receive antenna before applying the space–time decoding on the fly during the reception of the second OFDM symbol of the pair. In other words, four data symbols are estimated at each operation of the block, with an initial latency equal to the length of an OFDM symbol. It is worth mentioning that there is no need to store the channel coefficients for the first OFDM symbol of the pair, because they are the same as those of the subsequent one (i.e., the pilot tones are distributed over two OFDM symbols).

The processing stages that follow the space–time decoding calculations are related to the PUSC permutation,³ clustering, channelization and mapping of the data symbols, which enable the recovery of the originally transmitted bit sequence.

² The testbed uses a 2 × 2 MIMO scheme applying Alamouti's STBC (matrix A). Increasing the number of antennas would only imply a linear increase in the complexity of the space–time block decoding (with respect to the number of antennas), as long as an orthogonal STBC is still applied at the transmitter. Moreover, our testbed has certain hardware constraints that prevent the implementation of higher-order MIMO schemes (i.e., FPGA capacity, the maximum number of simultaneous channels that can be emulated in our channel emulator and the absence of the required instrumentation for generating a MIMO signal with more than two antennas).
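The combining operations in (14) and (15) can be illustrated for a single subcarrier with the sketch below. The decoder follows the two equations directly; the helper that generates the received samples assumes the usual Alamouti mapping (first symbol: antenna 1 sends $d_a$ and antenna 2 sends $d_b$; second symbol: antenna 1 sends $-d_b^{*}$ and antenna 2 sends $d_a^{*}$), a noise-free channel and perfect channel estimates. All names are hypothetical.

```python
def alamouti_decode(h, s_first, s_second):
    """Eqs. (14)-(15) for one subcarrier.

    h[i] = (H_i1, H_i2): channel estimates from Tx antennas 1 and 2 to Rx antenna i.
    s_first[i], s_second[i]: FFT outputs of Rx antenna i in the two symbols of the pair.
    """
    norm = sum(abs(h1) ** 2 + abs(h2) ** 2 for h1, h2 in h)
    d_a = sum(h1.conjugate() * s0 + h2 * s1.conjugate()
              for (h1, h2), s0, s1 in zip(h, s_first, s_second)) / norm
    d_b = sum(h2.conjugate() * s0 - h1 * s1.conjugate()
              for (h1, h2), s0, s1 in zip(h, s_first, s_second)) / norm
    return d_a, d_b


def alamouti_receive(h, d_a, d_b):
    """Noise-free received samples under the assumed Alamouti mapping."""
    s_first = [h1 * d_a + h2 * d_b for h1, h2 in h]
    s_second = [-h1 * d_b.conjugate() + h2 * d_a.conjugate() for h1, h2 in h]
    return s_first, s_second
```

Under these assumptions the cross terms cancel and the decoder returns the transmitted pair exactly, for any non-zero channel coefficients.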

5. GEDOMIS® testbed description

A graphic overview of the GEDOMIS® testbed setup for a point-to-point 2 × 2 MIMO system is shown in Fig. 10. The baseband part of the transmitter was designed in Matlab. The separate I/Q baseband outputs of this model are written to data files, which are fed to two instances of Agilent's Signal Studio Toolkit. The data files are then downloaded to two ESG4438C instruments, configuring in this way the operating parameters of the transmitter. Since the two ESG4438C instruments need to be time and phase aligned for the MIMO signal generation, several adjustments were applied (i.e., master–slave connection of the instruments, time alignment of the two signals, etc.). Finally, the two ESG4438C instruments use their embedded arbitrary waveform generators to play back the baseband I/Q waveforms in real time, up-convert the signal and provide the RF output centered at 2.595 GHz.

³ Implementing more permutation schemes would merely involve the design of additional blocks with control logic whose complexity in terms of implementation and cost is much lower than that of the rest of the system detailed in this paper.

Fig. 11. Performance of the testbed in quasi-static channel scenarios. Panels: mean EVM (dB) and mean raw BER, each plotted against the mean received SNR per antenna (dB), for the ITU-T vehicular A (static) and ITU-T pedestrian B (static) channels.

⁴ Channel coding is a standard processing block that has similar implementation issues in other broadband wireless communication standards. Taking into account the previous argument, the focus of the contribution of this paper and the limitations of our hardware platform, we have omitted channel coding from our system.

The Elektrobit Propsim C8 radio channel emulator allows accurate multi-channel emulation of custom or standardized models in the laboratory. This is feasible by adding complex and time-varying effects of multipath and Doppler shifts in the digital domain (e.g., adjusting the tap amplitude, delay spread, operation frequency and mobile speed). For the 2 × 2 MIMO scenario considered in this paper, four uncorrelated multipath fading channels were created (with different distribution seeds), using either the ITU Vehicular A or the ITU Pedestrian B channel model [24]. The channel emulator has to be tuned to provide optimal performance in terms of noise floor, dynamic range and error vector magnitude (EVM), allowing a sufficient safety margin for its operation. This is necessary to avoid signal distortion, degradation of the received signal to noise ratio (SNR), saturation of the ADCs and DACs (the signal PAPR has to be accounted for) and quantization errors.

At the receiver side, the Mercury Computer Systems Echotek Series RF 3000 Tuners, comprising 1 DDS module and 4 receiver modules, apply phase-coherent down-conversion from an RF signal of 2.6 GHz to an IF of 156.8 MHz, featuring high spectral purity and dynamic range. A prototyped SAW filter board is used to optimally match the signal bandwidth at IF and thus reduce the out-of-band noise level and eliminate spurious effects introduced by the channel emulator. Finally, two broadband RF noise generators provide extremely flat white noise at IF and facilitate the measurement campaign and the assessment of the system's performance under variable SNR conditions (e.g., the two noise sources were calibrated and balanced for every 2 dB attenuation step).

The signal is then delivered to the signal acquisition and processing development platform equipped with Lyrtech's ADC and FPGA/DSP boards. The VHS-ADC board includes 8 phase-synchronous channels with 14-bit ADCs and digitally controlled PGAs for each channel. The board also includes a Xilinx Virtex-4 FPGA that hosts the AGC, DDC and synchronization processing blocks of the receiver. The baseband signal (20 MHz, 22.4 Msps) is then fed to the SignalMaster Quad board, which features two clusters of a Xilinx Virtex-4 FPGA and two TMS320C6416 DSPs, capable of hosting the rest of the MIMO mobile WiMAX demodulation blocks.

6. Measurement and results

As already mentioned, the MIMO-enabled mobile WiMAX receiver was fitted in two Virtex-4 LX160 devices, with the following resource utilization: 81% of slices, 93% of RAMB16s and 100% of DSP48s for the first FPGA, and 49% of slices, 71% of RAMB16s and 57% of DSP48s for the second one (using Xilinx ISE 9.2). The real-time debugging of the receiver and the data visualization were carried out using Xilinx ChipScope Pro.

The results presented in this section were acquired after an extensive measurement campaign and post-processing of the captured data. Matlab scripts and a parser were developed for this purpose. The system performance has been evaluated in terms of the EVM and the raw bit error rate (BER). Obviously, the inclusion of a channel coding technique would significantly improve the system performance by lowering the final BER values with respect to the ones presented herein.⁴ The real-time channel emulator was configured with two MIMO channel models that form part of the radio conformance tests of the WiMAX Forum for mobility scenarios (i.e., ITU-T vehicular A (60 km/h) and pedestrian B (3 km/h) channels).

The obtained results allow the quantification of the performance losses caused by the hardware deployment of the signal processing algorithms. The co-simulation results (i.e., Matlab versus testbed data) revealed implementation losses of approximately 3 dB (e.g., quantization, finite bit representation, etc.), a figure that can be considered acceptable taking into account that the entire hardware setup also contributes to these losses. In Fig. 11a and b, a single static realization of both random channels (i.e., no mobility is emulated) has been used.

Fig. 12. Performance of the testbed in mobile channel scenarios. Panels: mean EVM (dB) and mean raw BER, each plotted against the mean received SNR per antenna (dB), for the ITU-T vehicular A (60 km/h) and ITU-T pedestrian B (3 km/h) channels.

Table 1. FPGA-deployed MIMO baseband power consumption estimation.

Consumption   Quiescent (W)   Dynamic (W)   Total (W)
FPGA-1        1.22            1.31          2.53
FPGA-2        1.13            0.62          1.75

⁵ The power consumed when no signal switching occurs is defined as quiescent, while the dynamic power consumption represents the accumulated power dissipation of the operating components comprising the FPGA design.

Finally, in Fig. 12a and b the same measurements are repeated for 100 different realizations of the random channel (i.e., using a different channel seed), applying 6 attenuation steps of the additive white Gaussian noise (AWGN) generators per channel for each of the vehicular A and pedestrian B models. The curves are produced by averaging the 100 data captures for each of the 6 attenuation steps. This method of obtaining the results allows an accurate analysis of the receiver performance under different mobility scenarios.

As could be expected, the performance of the system using the pedestrian B channel model is better than that obtained in the vehicular A case. This is due to the fact that in the first case the mobile speed is significantly lower, allowing improved tracking of the channel variations by the AGC algorithm (i.e., the power level of the input signal is better accommodated to the dynamic range of the ADCs). As far as diversity is concerned, the theoretical order is expected to be 4; however, the average BER curves obtained in the real implementation tend to have a slope of 2 (i.e., a diversity order equal to 2). This reduction in the diversity order with respect to the theoretical case has two main causes. First, the theoretical results generally assume that the receiver tracks the channel perfectly, which is not realistic in a real-life implementation. Second, the floating-point logic dominating the proposed MIMO algorithms is subject to quantization errors when it is transformed to fixed-point logic; truncation and finite bit representation were indeed a constraint in our FPGA-based implementation. Nonetheless, the obtained results allowed us to demonstrate, validate and measure the actual performance of a real MIMO system implemented in hardware. Indeed, the performance profile is within the expected margins and constitutes the prelude to a testbed that can host real 4G systems.

6.1. Power consumption

The FPGA-based design of the MIMO mobile WiMAX receiver presented in this paper did not follow an energy-efficient design path (i.e., this is not part of the scope of this paper). However, it is useful to include preliminary power-consumption metrics to enable a relative assessment of our prototype's power-consumption footprint (power optimizations will be part of our future work). It is important to take into account that the target FPGA device does not belong to a power-efficient Xilinx family, while at the same time the power-reduction margin that could be achieved with the Xilinx ISE 9.2 tool is very limited. The XPower software tool from Xilinx has been used to estimate the PHY-layer power consumption of the prototyped MIMO mobile WiMAX receiver.

The conducted analysis accounts for both quiescent and dynamic power dissipation.⁵ Obviously, the design decision to divide the PHY-layer implementation of the mobile WiMAX receiver between two FPGA devices plays a major role in the presented power-consumption metrics. The first part, namely the digital front-end (from the AGC up to the synchronization processing block), is always operating, while the second part (from the CP removal up to the de-mapping processing block) is not operating during the silence period between data frames, which can be considered the system's idle state. As may be observed in Table 1, when the system is in the idle state, the second part of the design (denoted as FPGA-2) will only present quiescent power consumption, achieving a 14% power consumption saving. The presence of multiple clock regions in FPGA-1 increases the overall power consumption. As expected, the power consumption in the idle state is quite high due to the inherently high power leakage of the Virtex-4 FPGA technology. Mapping our design to an application-specific integrated circuit (ASIC) implementation could dramatically reduce this metric.
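The 14% figure appears to be consistent with Table 1 if the saving is read as the dynamic power of FPGA-2 relative to the combined total power of both devices. This reading is an assumption, since the text does not spell out the calculation:

```latex
\frac{P_{\mathrm{dyn}}^{\mathrm{FPGA\text{-}2}}}
     {P_{\mathrm{tot}}^{\mathrm{FPGA\text{-}1}} + P_{\mathrm{tot}}^{\mathrm{FPGA\text{-}2}}}
= \frac{0.62}{2.53 + 1.75}
= \frac{0.62}{4.28}
\approx 0.145
```

Under this reading, the idle-state consumption would be about 4.28 W − 0.62 W = 3.66 W.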

7. Conclusions and future work

A designer of real-time MIMO systems is confronted with various conventional and other less predictable software and hardware issues during modeling, implementation and debugging. This paper presented the deployment of the complete PHY layer of a real-time MIMO mobile WiMAX receiver. The overall system validation under realistic channel mobility emulation was facilitated by the GEDOMIS® testbed. The presented work covers an in-depth analysis of the receiver's architecture, design and FPGA implementation, focusing especially on the computational and deployment complexity of this undertaking. The system design followed a modular approach that allows more OFDM or MIMO schemes to be implemented in the future. The contribution of our work lies mainly in building and validating a realistic MIMO system based on the mobile WiMAX standard; the 20 MHz channel bandwidth and the real-time implementation have scaled up the implementation complexity, while at the same time facilitating experimentation with advanced research concepts.

The benefits of different multi-antenna schemes compared with single-antenna implementations could be investigated in the future by applying minor changes to the currently available implementation (i.e., SIMO and MISO). More results could be obtained in the future with the inclusion of channel coding processing blocks. Finally, a real-time closed-loop MIMO mobile WiMAX system could be compared with our existing open-loop implementation by deploying an FPGA-based MIMO transmitter.

References

[1] S.M. Alamouti, A simple transmit diversity technique for wirelesscommunications, IEEE Journal on Selected Areas in Communications16 (1998) 1451–1458.

[2] K. Pelechrinis, I. Broustis, T. Salonidis, S.V. Krishnamurthy, P.Mohapatra, Design and deployment considerations for highperformance MIMO testbeds, in: Proceedings of the InternationalWireless Internet Conference (WICON).

[3] S. Haene, D. Perels, A. Burg, A real-time 4-stream MIMO-OFDMtransceiver: system design, FPGA implementation, and characteri-zation, IEEE Journal on Selected Areas in Communications 26 (2008)877–889.

[4] R. Chen, Q. Cai, K. Alecke, O. Lazar, T. Kaiser, A real-time PRE-MIMO-LTE software radio testbed, in: Proceedings of the European SignalProcessing Conference (EUSIPCO).

[5] A. Ibing, D. Kühling, M. Kuszak, C.v. Helmolt, V. Jungnickel, Flexibledemonstrator platform for cooperative joint transmission anddetection in next generation wireless MIMO-OFDM networks, in:Proceedings of the IEEE International Conference on Testbeds andResearch Infrastructure for the Development of Networks andCommunities.

[6] V. Jungnickel, M. Schellmann, L. Thiele, T. Wirth, T. Haustein, O. Koch,W. Zirwas, E. Schulz, Interference-aware scheduling in the multiuserMIMO-OFDM downlink, IEEE Communications Magazine 47 (2009)56–66.

[7] K.S. Bialkowski, P. Uthansakul, M.E. Bialkowski, A. Postula, Design ofMIMO testbed with an FPGA board for fast signal processing, in:Proceedings of the International Wireless Internet Conference(WICON).

[8] H.J. Pérez-Iglesias, J.A. García-Naya, A. Dapena, L. Castedo, V. Zarzoso,Blind channel identification in Alamouti coded systems: acomparative study of eigendecomposition methods in indoortransmissions at 2.4 GHz, European Transactions on Telecommuni-cations 19 (2008) 751–759 (special issue: European Wireless 2007).

[9] P. Murphy, A. Sabharwal, B. Aazhang, Design of WARP: a wirelessopen-access research platform, in: Proceedings of the EuropeanSignal Processing Conference (EUSIPCO).

[10] P. Murphy, A. Sabharwal, B. Aazhang, On building a cooperativecommunication system: testbed implementation and first results,EURASIP Journal on Wireless Communications and Networking(2009).

[11] S. Caban, C. Mehlaführer, R. Langwieser, A.L. Scholtz, M. Rupp,Vienna MIMO testbed, EURASIP Journal on Applied Signal Processing(2006).

[12] C. Mehlführer, S. Caban, J.A. García-Nayaz, M. Rupp, Throughput andcapacity of MIMO WiMAX, in: Proceedings of the AsilomarConference on Signals, Systems and Computers.

[13] D. Ramírez, I. Santamaría, J. Pérez, J. Vía, J.A. García-Naya, T.M.Fernández-Caramés, H.J. Pérez-Iglesias, M. González-López, L.Castedo, J.M. Torres-Royo, A Comparative Study of STBCTransmissions at 2.4 GHz Over Indoor Channels Using a 2 + 2MIMO Testbed, Wireless Communications and Mobile Computing,vol. 8, John Wiley and Sons, 2008. pp. 1149–1164.

[14] V.P.G. Jiménez, M.J.F.-G. García, A.G. Armada, R.P. Torres, J.J.G.Fernández, M.P. Sánchez-Fernández, M. Domingo, O. Fernández,MIMO-OFDM Testbed, channel measurements, and systemconsiderations for outdoor-indoor WiMAX, EURASIP Journal onWireless Communications and Networking (2010).

[15] P. Imperatore, E. Salvadori, I. Chlamtac, Path loss measurements at3.5 GHz: a trial test WiMAX based in rural environment, in:Proceedings of the IEEE International Conference on Testbeds andResearch Infrastructure for the Development of Networks andCommunities (TridentCom).

[16] S. Hu, G. Wu, Y.L. Guan, C.L. Law, Y. Yan, S. Li, Development andperformance evaluation of mobile WiMAX testbed, in: Proceedingsof the IEEE Mobile WiMAX Symposium.

[17] S. Mignanti, M. Castellano, M. Spada, P. Simoes, G. Tamea, A.Cimmino, P.M. Neves, I. Marchetti, F. Andreotti, G. Landi, K.Pentikousis, WEIRD testbeds with fixed and mobile WiMAXtechnology for user applications, telemedicine and monitoring ofimpervious areas, in: Proceedings of the IEEE InternationalConference on Testbeds and Research Infrastructure for theDevelopment of Networks and Communities (TridentCom).

[18] M.S. Khairy, M.M. Abdallah, S.E.D. Habib, Efficient FPGAimplementation of MIMO decoder for mobile WiMAX system, in:Proceedings of the IEEE International Conference on Communications(ICC).

[19] Q. Wang, D. Fan, Y.H. Lin, J. Chen, Z. Zhu, Design of BStransceiver for IEEE 802.16E OFDMA mode, in: Proceedings ofthe IEEE International Conference on Acoustics, Speech and SignalProcessing (ICASSP).

[20] Y.J. Wu, J.M. Lin, H.Y. Yu, H.P. Ma, A baseband testbed foruplink mobile MIMO WiMAX communications, in: Proceedingsof the IEEE International Symposium on Circuits and Systems(ISCAS).

[21] L. Zheng, D.N.C. Tse, Diversity and multiplexing: a fundamentaltradeoff in multiple-antenna channels, IEEE Transactions onInformation Theory 49 (2003) 1073–1096.

[22] G. Ganesan, P. Stoica, Space–time block codes: a maximum SNRapproach, IEEE Transactions on Information Theory 47 (2001) 1650–1656.

O. Font-Bach et al. / Computer Networks 55 (2011) 3634–3647

[23] V. Tarokh, N. Seshadri, A.R. Calderbank, Space–time codes for high data rate wireless communication: performance criterion and code construction, IEEE Transactions on Information Theory 44 (1998) 744–765.

[24] Guidelines for Evaluation of Radio Transmission Technologies for IMT-2000, Rec. ITU-R M.1225, 1997.

O. Font-Bach (Terrassa, 1981) received his M.Sc. degree in Computer Engineering from Universitat Autònoma de Barcelona (UAB, 2004). He also obtained a Master in Design of Integrated Circuits from UAB (2006), as part of a postgraduate internship in the Institut de Microelectrònica de Barcelona (IMB-CNM CSIC) where he worked in the design automation of AMBA-based NoC systems for FPGA-based implementations. He is currently a Research Engineer at the Centre Tecnològic de Telecomunicacions de Catalunya (CTTC) working in the implementation of real-time baseband signal processing algorithms for high-bandwidth MIMO-OFDM(A) systems. He is also pursuing his Ph.D thesis at the Department of Signal Theory and Communications of Universitat Politècnica de Catalunya (UPC). His research interests focus on baseband digital signal processing for multi-carrier communication systems.

N. Bartzoudis (Alexandroupoli, Greece 1976) obtained his B.Sc. in Electronic Engineering at the Technical Educational Institute of Thessaloniki (Greece, 2000). He then pursued postgraduate studies and received his M.Sc. degree (in Digital Communication Systems) and Ph.D. degree (in dependable embedded systems) from Loughborough University (UK) in 2001 and 2006 respectively. He has also worked as a post-doctoral researcher at the University of Essex (UK). Nikolaos is currently holding a Research Associate position at CTTC (Spain) and he is in charge of the GEDOMIS testbed. His core working area focuses on baseband digital signal processing implementations of wideband multi-carrier communication systems. Nikolaos has served as TPC member in various international conferences and he is a regular reviewer for numerous scientific journals.

A. Pascual-Iserte (Barcelona, 1977) received the Electrical Engineering degree and the Ph.D. degree from the Technical University of Catalonia (UPC), Barcelona, in September 2000 and February 2005, respectively. He was awarded with the “First National Prize of 2000/2001 University Education” by the Spanish Ministry of Education and Culture, and with the “Best 2004/2005 PhD Thesis Prize” by UPC. In September 2003 he became Assistant Professor at the Escola d’Enginyeria de Telecomunicació i Aeroespacial de Castelldefels of the UPC, where he teaches currently undergraduate courses in Linear Systems and Digital Signal Processing. He also teaches post-graduate courses in Advanced Signal Processing in the Dept. of Signal Theory and Communications. Since April 2008 he is Associate Professor at UPC. His current research interests include: array processing, robust designs, OFDM, MIMO channels, multiuser access and optimization theory.

D. López Bueno (Granollers, 1978) received the Electrical Engineering degree from Universitat Politècnica de Catalunya (UPC) in 2005. From 2003 to 2006 he worked as RF designer, first in Mier Comunicaciones and later in Thales Alenia Space España, participating in several Space projects like Amazonas 1, TRMO, Galileo and ICO G1. By the end of 2006, he joined CTTC as Research Engineer where he works in the RF hardware development and test and measurement operations of GEDOMIS testbed (MIMO wireless system demonstrator). He is also working in interference risk assessment between satellite and terrestrial communication systems, and currently pursuing a PhD in Electronics Engineering at UPC. His main interests include the architecture evaluation and design of high-end spectrum sensing and agile RF Transceivers to tackle next generation wireless applications.