An Overview of MIMO Systems in Wireless Communications

Lecture in “Communication Theory for Wireless Channels”

Sebastien de la Kethulle — September 27, 2004

Future Broadband Wireless Systems

• Desired attributes

– Significant increase in spectral efficiency and data rates

– High Quality–of–Service (QoS) — bit error rate, outage, . . .

– Wide coverage

– Low deployment, maintenance and operation costs

• The wireless channel is very hostile

– Severe fluctuations in signal level (fading)

– Co–channel interference

– Signal power falls off with distance (path loss)

– Scarce available bandwidth

– . . .

The Wireless Channel

• Multipath propagation causes signal fading

MIMO System

Performance Improvements Using MIMO Systems

• Array gain =⇒ increase coverage and QoS

• Diversity gain =⇒ increase coverage and QoS

• Multiplexing gain =⇒ increase spectral efficiency

• Co–channel interference reduction =⇒ increase cellular capacity

Array Gain

• Increase in average received SNR obtained by coherently combining the incoming / outgoing signals

• Requires channel knowledge at the transmitter / receiver

Array Gain

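$$\lambda_1, \ldots, \lambda_m = \begin{cases} \text{eigenvalues of } \mathbf{H}\mathbf{H}^\dagger & \text{if } M < N \\ \text{eigenvalues of } \mathbf{H}^\dagger\mathbf{H} & \text{if } M \ge N \end{cases}$$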

y = Hx + n

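• $\mathbf{H} \in \mathbb{C}^{M \times N}$ with $E|H_{ik}|^2 = 1$; $\mathbf{x} \in \mathbb{C}^N$, $\mathbf{y} \in \mathbb{C}^M$

• $\mathbf{n} \in \mathbb{C}^M$: zero–mean complex Gaussian noise

• Principle: To obtain the full array gain, one should transmit using the maximum eigenmode of the channel

• The singular value decomposition (SVD) $\mathbf{H} = \mathbf{U}\mathbf{D}\mathbf{V}^\dagger$, with $\mathbf{D} = \mathrm{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_m}, 0, \ldots, 0)$ and $m = \min\{N, M\}$, yields $m$ equivalent SISO channels

$$\tilde{\mathbf{y}} = \mathbf{D}\tilde{\mathbf{x}} + \tilde{\mathbf{n}},$$

where $\tilde{\mathbf{y}} = \mathbf{U}^\dagger\mathbf{y}$, $\tilde{\mathbf{x}} = \mathbf{V}^\dagger\mathbf{x}$ and $\tilde{\mathbf{n}} = \mathbf{U}^\dagger\mathbf{n}$ ($\mathbf{U}$, $\mathbf{V}$ unitary)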

Array Gain

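$$\tilde{\mathbf{y}} = \mathbf{D}\tilde{\mathbf{x}} + \tilde{\mathbf{n}}$$

• If $\lambda_i = \lambda_{\max} = \max\{\lambda_1, \ldots, \lambda_m\}$ (maximum eigenmode),

$$\tilde{y}_i = \sqrt{\lambda_{\max}}\, \tilde{x}_i + \tilde{n}_i$$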

• Known results

– For N × 1 and 1 × M arrays, the array gain (increase in average SNR) is respectively $10 \log_{10} N$ and $10 \log_{10} M$ dB

– In the asymptotic limit, with M large:

$$\lambda_{\max} < (\sqrt{c} + 1)^2 M, \qquad c = N/M \ge 1$$

$$\lambda_{\min} > (\sqrt{c} - 1)^2 M, \qquad c = N/M > 1$$

• For maximum

– Capacity: waterfilling (later in this presentation)

– Array gain: use only the maximum eigenchannel
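The eigenmode decomposition above can be checked numerically. The following is a minimal NumPy sketch, assuming an i.i.d. Rayleigh channel with illustrative sizes M = 4 and N = 3 (these values and all variable names are not from the slides): it verifies that precoding with the dominant right singular vector and combining with the dominant left singular vector yields an effective scalar gain of $\sqrt{\lambda_{\max}}$.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 3  # receive / transmit antennas (illustrative values)

# i.i.d. Rayleigh channel with E|H_ik|^2 = 1
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

# H = U D V^dagger; the singular values are sqrt(lambda_i), sorted descending
U, s, Vh = np.linalg.svd(H)
lam = s**2

# Maximum eigenmode: transmit along v_max, combine along u_max
v_max = Vh[0].conj()   # first column of V
u_max = U[:, 0]        # first column of U
effective_gain = u_max.conj() @ H @ v_max  # should equal sqrt(lambda_max)
```

Using only this eigenmode maximizes the array gain; it is not the capacity-achieving strategy, which spreads power over several eigenmodes.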

Diversity Gain

• Principle: provide the receiver with multiple identical copies of a given signal to combat fading =⇒ gain in instantaneous SNR


Diversity Gain

• Intuitively, the more independently fading, identical copies of a given signal the receiver is provided with, the faster the bit error rate (BER) decreases as a function of the per–signal SNR. At high SNR values, it has been shown that

$$P_e \approx (G_c \cdot \mathrm{SNR})^{-d}$$

where $d$ represents the diversity gain and $G_c$ the coding gain

• Definition: For a given transmission rate $R$, the diversity gain is

$$d(R) = -\lim_{\mathrm{SNR} \to \infty} \frac{\log P_e(R, \mathrm{SNR})}{\log \mathrm{SNR}}, \qquad (1)$$

where $P_e(R, \mathrm{SNR})$ is the BER at the given rate and SNR

• Independent versus correlated fading

• Diminishing return for each extra signal copy
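As an illustration of definition (1), the sketch below estimates the diversity gain as the high-SNR slope of a BER curve on a log–log scale. It assumes BPSK with maximum-ratio combining over d i.i.d. Rayleigh branches, for which a standard closed-form average BER exists; this particular modulation and combiner are assumptions chosen for the example, not something specified on the slides.

```python
import math
import numpy as np

def ber_bpsk_mrc(snr, d):
    """Average BER of BPSK with d-branch MRC over i.i.d. Rayleigh fading.

    snr is the average SNR per branch (linear scale).
    """
    mu = np.sqrt(snr / (1.0 + snr))
    total = sum(math.comb(d - 1 + k, k) * ((1 + mu) / 2) ** k for k in range(d))
    return ((1 - mu) / 2) ** d * total

# High-SNR range over which the slope is fitted (illustrative choice)
snr_db = np.linspace(30, 40, 11)
snr = 10 ** (snr_db / 10)

def estimated_diversity(d):
    """Magnitude of the slope of log Pe versus log SNR."""
    pe = ber_bpsk_mrc(snr, d)
    slope, _ = np.polyfit(np.log10(snr), np.log10(pe), 1)
    return -slope
```

The fitted slope should approach d as the SNR range moves higher, which is the behaviour the limit in (1) formalizes.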

Diversity Gain

[Figure: BER versus SNR per receive antenna; in the figure, $L \triangleq d$]

• The diversity gain is the magnitude of the slope of the BER $P_e(R, \mathrm{SNR})$ plotted as a function of SNR on a log–log scale

Multiplexing Gain

• Principle: Transmit independent data signals from different antennas to increase the throughput

Co–Channel Interference

Co–Channel Interference Reduction

• N − 1 interferers can be cancelled with N transmit antennas

• M − 1 interferers can be cancelled with M receive antennas

Capacity of MIMO Systems — The Gaussian Channel

y = Hx + n,


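• $\mathbf{H} \in \mathbb{C}^{M \times N}$ with uniform phase and Rayleigh magnitude (Rayleigh fading environment): i.i.d. zero–mean Gaussian entries whose real and imaginary parts are independent, each with variance 1/2

• $\mathbf{x} \in \mathbb{C}^N$, $\mathbf{y} \in \mathbb{C}^M$

• $\mathbf{n}$: zero–mean complex Gaussian noise with independent real and imaginary parts of equal variance, $E[\mathbf{n}\mathbf{n}^\dagger] = \mathbf{I}_M$

• Transmitter power constraint: $E[\mathbf{x}^\dagger\mathbf{x}] = \mathrm{tr}(E[\mathbf{x}\mathbf{x}^\dagger]) \le P$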

Circularly Symmetric Random Vectors

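Definition: A complex Gaussian random vector $\mathbf{x} \in \mathbb{C}^n$ is said to be circularly symmetric if the corresponding vector

$$\hat{\mathbf{x}} = \begin{bmatrix} \mathrm{Re}(\mathbf{x}) \\ \mathrm{Im}(\mathbf{x}) \end{bmatrix} \in \mathbb{R}^{2n}$$

has a covariance with the structure

$$E\left[(\hat{\mathbf{x}} - E[\hat{\mathbf{x}}])(\hat{\mathbf{x}} - E[\hat{\mathbf{x}}])^\dagger\right] = \frac{1}{2}\begin{bmatrix} \mathrm{Re}(\mathbf{Q}) & -\mathrm{Im}(\mathbf{Q}) \\ \mathrm{Im}(\mathbf{Q}) & \mathrm{Re}(\mathbf{Q}) \end{bmatrix}$$

for some Hermitian non–negative definite $\mathbf{Q} \in \mathbb{C}^{n \times n}$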

Circularly Symmetric Random Vectors

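The pdf of a CSCG random vector $\mathbf{x}$ with mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{Q}$ is given by

$$f_{\boldsymbol{\mu},\mathbf{Q}}(\mathbf{x}) = \frac{1}{\det(\pi\mathbf{Q})} \exp\left[-(\mathbf{x} - \boldsymbol{\mu})^\dagger \mathbf{Q}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$$

and has differential entropy

$$h(X) = -\int_{\mathbb{C}^n} f_{\boldsymbol{\mu},\mathbf{Q}}(\mathbf{x}) \log f_{\boldsymbol{\mu},\mathbf{Q}}(\mathbf{x})\, d\mathbf{x} = \log\det(\pi e \mathbf{Q})$$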

The Deterministic Gaussian Channel — Capacity

y = Hx + n, E [x†x] ≤ P

Idea: Maximize the mutual information between x and y

I(X;Y) = h(Y)− h(Y|X)

= h(Y)− h(N)

=⇒ Maximize h(Y)

Maximizing h(Y)

It can be shown that:

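• If $\mathbf{x}$ satisfies $E[\mathbf{x}^\dagger\mathbf{x}] \le P$, then so does $\mathbf{x} - E[\mathbf{x}]$

• Among all $\mathbf{y} \in \mathbb{C}^M$ with a given covariance, $h(Y)$ is maximized if $\mathbf{y}$ is Circularly Symmetric Complex Gaussian (CSCG)

• If $\mathbf{x} \in \mathbb{C}^N$ is CSCG with covariance $\mathbf{Q}$, then $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n} \in \mathbb{C}^M$ is also CSCG

$$\Longrightarrow\; I(X;Y) = \log\det\left(\pi e(\mathbf{I}_M + \mathbf{H}\mathbf{Q}\mathbf{H}^\dagger)\right) - \log\det(\pi e\,\mathbf{I}_M) = \log\det(\mathbf{I}_M + \mathbf{H}\mathbf{Q}\mathbf{H}^\dagger)$$

• A non–negative definite $\mathbf{Q}$ such that $I(X;Y)$ is maximum and $\mathrm{tr}(\mathbf{Q}) \le P$ remains to be found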

Deterministic Gaussian MIMO Channel

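• H known at the transmitter (“waterfilling solution”): Choose $\mathbf{Q}$ diagonal (in the eigenbasis of $\mathbf{H}^\dagger\mathbf{H}$), such that

$$Q_{ii} = (\alpha - \lambda_i^{-1})^+, \qquad i = 1, \ldots, N$$

with $(\cdot)^+ \triangleq \max(\cdot, 0)$, $(\lambda_1, \ldots, \lambda_N)$ the eigenvalues of $\mathbf{H}^\dagger\mathbf{H}$ and $\alpha$ such that $\sum_i Q_{ii} = P$. The capacity is given by:

$$C_{WF} = \sum_{i=1}^{N} \left(\log(\alpha\lambda_i)\right)^+ \quad \text{[bits/s/Hz]}$$

• H unknown at the transmitter: Choose $\mathbf{Q} = \frac{P}{N}\mathbf{I}_N$ (equal power). Then,

$$C_{EP} = \log\det\left(\mathbf{I}_M + \frac{P}{N}\mathbf{H}\mathbf{H}^\dagger\right) \quad \text{[bits/s/Hz]}$$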
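The waterfilling allocation can be computed by trying decreasing numbers of active eigenchannels until the water level α exceeds every active $\lambda_i^{-1}$. The sketch below is one possible way to do this in NumPy, assuming unit noise variance and base-2 logarithms; the function name `waterfilling` and the example sizes are illustrative assumptions.

```python
import numpy as np

def waterfilling(lam, P):
    """Return (alpha, Q) with Q_i = max(alpha - 1/lam_i, 0) and sum(Q) = P."""
    lam = np.asarray(lam, dtype=float)
    inv = np.full(lam.shape, np.inf)       # zero eigenvalues never get power
    inv[lam > 0] = 1.0 / lam[lam > 0]
    order = np.sort(inv)                   # best channels first
    for k in range(len(order), 0, -1):     # k = number of active channels
        alpha = (P + order[:k].sum()) / k  # water level if k channels are used
        if alpha > order[k - 1]:           # weakest active channel gets power > 0
            break
    Q = np.maximum(alpha - inv, 0.0)
    return alpha, Q

rng = np.random.default_rng(0)
M, N, P = 4, 4, 10.0                       # illustrative values
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
lam = np.linalg.eigvalsh(H.conj().T @ H)   # eigenvalues of H^dagger H

alpha, Q = waterfilling(lam, P)
active = Q > 0
C_wf = np.sum(np.log2(alpha * lam[active]))      # sum of (log2(alpha*lam_i))^+
C_ep = np.sum(np.log2(1.0 + (P / N) * lam))      # equal-power capacity
```

For any channel realization `C_wf` is at least `C_ep`, since equal power is one of the allocations waterfilling optimizes over.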

Waterfilling Solution

Rayleigh Fading MIMO Channel

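• Memoryless Rayleigh fading Gaussian channel (unknown at the transmitter)

• Choose $\mathbf{x}$ CSCG and $\mathbf{Q} = \frac{P}{N}\mathbf{I}_N$. The ergodic capacity is given by:

$$C_{EP} = E_{\mathbf{H}}\left[\log\det\left(\mathbf{I}_M + \frac{P}{N}\mathbf{H}\mathbf{H}^\dagger\right)\right] = E_{\mathbf{H}}\left[\sum_{i=1}^{m} \log\left(1 + \frac{P}{N}\lambda_i\right)\right]$$

where $m = \min(N, M)$ and $\lambda_1, \ldots, \lambda_m$ are the eigenvalues of the Wishart matrix

$$\mathbf{W} = \begin{cases} \mathbf{H}\mathbf{H}^\dagger & M < N \\ \mathbf{H}^\dagger\mathbf{H} & M \ge N \end{cases}$$

• For large SNR, $C_{EP} = \min(N, M) \log P + O(1)$, i.e. the capacity grows linearly with $\min(N, M)$!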
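The expectation over H has closed forms for i.i.d. Rayleigh fading, but a Monte Carlo estimate is often enough to see the linear growth with min(N, M). The following sketch assumes unit noise variance (so P plays the role of the SNR) and base-2 logarithms; the function name and trial count are illustrative.

```python
import numpy as np

def ergodic_capacity_ep(N, M, snr, trials=2000, seed=0):
    """Monte Carlo estimate of E_H[log2 det(I_M + (snr/N) H H^dagger)]."""
    rng = np.random.default_rng(seed)
    shape = (trials, M, N)
    # i.i.d. CSCG entries with unit variance
    H = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    G = np.eye(M) + (snr / N) * (H @ H.conj().transpose(0, 2, 1))
    _, logdet = np.linalg.slogdet(G)   # natural-log determinant per realization
    return float(np.mean(logdet) / np.log(2))

c_siso = ergodic_capacity_ep(1, 1, 100.0)   # 20 dB, single antenna each side
c_mimo = ergodic_capacity_ep(4, 4, 100.0)   # 20 dB, 4 x 4
```

Comparing the estimates at two high SNR values ten decibels apart gives a capacity increase close to min(N, M)·log2(10) bits, which is the slope predicted by the last bullet.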

Capacity of Fading Channels

• Rayleigh fading: the capacity grows linearly with min(N,M)

• Ricean channels: Increasing the line–of–sight (LOS) strength at fixed SNR reduces the capacity

• If the gains in H become highly correlated, there is a capacity loss

• Waterfilling (WF) capacity gains over Equal Power (EP) capacity are significant at low SNR but converge to zero as the SNR increases

=⇒ Question: Is it beneficial to feed the channel state back to the transmitter?

• Many exact capacity results are known for i.i.d. Rayleigh channels. For other channels (Rice, etc.), we have many limiting results

Ergodic Capacity of Ideal MIMO Systems

[Figure: ergodic capacity of ideal MIMO systems; in the figure, $M_T \triangleq N$ and $M_R \triangleq M$]

Channel unknown at the transmitter, i.i.d. Rayleigh fading

Outage Capacity

• The capacity of a fading channel is a random variable

• Definition: The q% outage capacity $C_{out,q}$ of a fading channel is the information rate that is guaranteed for (100 − q)% of the channel realizations, i.e.

$$P(I(X;Y) \le C_{out,q}) = q\%$$

• Since, for large SNR and i.i.d. Rayleigh fading,

$$C = \min(N, M) \log \mathrm{SNR} + O(1),$$

we can define the multiplexing gain $r$ as

$$r = \lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}}$$

which comes at no extra bandwidth or power
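Because the outage capacity is a quantile of the mutual information, it can be estimated from samples. The sketch below assumes equal-power transmission over an i.i.d. Rayleigh channel with unit noise variance; the helper names and the 2 × 2, 20 dB example are illustrative assumptions.

```python
import numpy as np

def mutual_information_samples(N, M, snr, trials=5000, seed=0):
    """Samples of log2 det(I_M + (snr/N) H H^dagger) over random channels."""
    rng = np.random.default_rng(seed)
    shape = (trials, M, N)
    H = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    G = np.eye(M) + (snr / N) * (H @ H.conj().transpose(0, 2, 1))
    return np.linalg.slogdet(G)[1] / np.log(2)

def outage_capacity(samples, q):
    """Rate that the mutual information falls below in q percent of cases."""
    return float(np.percentile(samples, q))

samples = mutual_information_samples(N=2, M=2, snr=100.0)
c_out_10 = outage_capacity(samples, 10.0)   # 10% outage capacity
```

The 10% outage capacity is necessarily below the ergodic (mean) value, since it is the rate supported by 90% of the realizations.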

Outage Capacity of Ideal MIMO Systems

[Figure: outage capacity of ideal MIMO systems; in the figure, $M_T \triangleq N$ and $M_R \triangleq M$]

Channel unknown at the transmitter, i.i.d. Rayleigh fading

Transmission over MIMO channels

We can use the advantages provided by MIMO channels to:

• Maximize diversity to combat channel fading and decrease the bit error rate (BER) =⇒ space–time codes (STC)

• Maximize the throughput =⇒ spatial multiplexing, V–BLAST (Bell Laboratories Layered Space–Time)

• Try to do both at the same time =⇒ trade–off between increasing the throughput and increasing diversity

Maximizing Diversity with Space–Time Codes

• Space–Time Trellis Codes (STTC) ←− often better performance at the cost of increased complexity

– Complex decoding (vector version of the Viterbi algorithm): the decoding complexity increases exponentially with the transmission rate

– Full diversity. Coding gain

• Space–Time Block Codes (STBC)

– Simple maximum–likelihood (ML) decoding based on linear processing

– Full diversity. Minimal or no coding gain

Alamouti Scheme for Transmit Diversity (STBC)

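$$\begin{cases} r_1 = h_1 c_1 + h_2 c_2 + n_1 & \text{[time } t\text{]} \\ r_2 = -h_1 c_2^* + h_2 c_1^* + n_2 & \text{[time } t + T\text{]} \end{cases}$$

$$\tilde{r}_1 = h_1^* r_1 + h_2 r_2^* = (|h_1|^2 + |h_2|^2)c_1 + h_1^* n_1 + h_2 n_2^* \;\longrightarrow\; \hat{c}_1$$

$$\tilde{r}_2 = h_2^* r_1 - h_1 r_2^* = (|h_1|^2 + |h_2|^2)c_2 - h_1 n_2^* + h_2^* n_1 \;\longrightarrow\; \hat{c}_2$$

• Assumption: the channel remains unchanged over two consecutive symbols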

• Rate = 1 — Diversity order = 2 — Simple decoding
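The combining step can be checked with a few lines of NumPy. This is a minimal sketch for two transmit antennas and one receive antenna, assuming QPSK symbols and perfect channel knowledge at the receiver; the function names are illustrative. The combiner output is normalized by $|h_1|^2 + |h_2|^2$ so that it can be compared directly with the transmitted symbols.

```python
import numpy as np

def alamouti_receive(c1, c2, h1, h2, n1=0.0, n2=0.0):
    """Received samples in the two symbol periods (channel constant over both)."""
    r1 = h1 * c1 + h2 * c2 + n1
    r2 = -h1 * np.conj(c2) + h2 * np.conj(c1) + n2
    return r1, r2

def alamouti_combine(r1, r2, h1, h2):
    """Linear combining that decouples c1 and c2."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1 = (np.conj(h1) * r1 + h2 * np.conj(r2)) / gain
    s2 = (np.conj(h2) * r1 - h1 * np.conj(r2)) / gain
    return s1, s2

rng = np.random.default_rng(0)
h1, h2 = (rng.standard_normal(2) + 1j * rng.standard_normal(2)) / np.sqrt(2)
c1, c2 = (1 + 1j) / np.sqrt(2), (-1 + 1j) / np.sqrt(2)   # two QPSK symbols

r1, r2 = alamouti_receive(c1, c2, h1, h2)      # noiseless for the check
s1, s2 = alamouti_combine(r1, r2, h1, h2)
```

In the noiseless case `s1` and `s2` reproduce `c1` and `c2` exactly; with noise, each estimate sees the sum of both channel gains, which is where the diversity order of 2 comes from.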

STBC Receiver Structure

STBCs from Complex Orthogonal Designs

• Alamouti’s scheme works only when N = 2 =⇒ Generalization

• Definition: A complex orthogonal design $\mathcal{O}_c$ of size N is an orthogonal matrix with entries in the indeterminates $\pm x_1, \pm x_2, \ldots, \pm x_N$, their conjugates $\pm x_1^*, \pm x_2^*, \ldots, \pm x_N^*$ or multiples of these indeterminates by $\pm\sqrt{-1}$

• Example (2 × 2), with columns indexed by space (transmit antenna) and rows by time:

$$\mathcal{O}_c(x_1, x_2) = \begin{pmatrix} x_1 & x_2 \\ -x_2^* & x_1^* \end{pmatrix}$$

• Coding scheme (using a constellation $\mathcal{A}$ with $2^b$ elements):

1. At time slot t, Nb bits arrive at the encoder. Select constellation signals $c_1, \ldots, c_N$

2. Set $x_i = c_i$ to obtain a matrix $\mathbf{C} = \mathcal{O}_c(c_1, \ldots, c_N)$

3. At each time slot $t = 1, \ldots, N$, the entries $C_{ti}$, $i = 1, \ldots, N$ are transmitted simultaneously from transmit antennas 1, 2, . . . , N

STBCs from Complex Orthogonal Designs

• The maximum–likelihood detection rule reduces to simple linear processing for STBCs

• One can obtain the maximum possible diversity order MN at transmission rate R = 1 using STBCs based on orthogonal designs

• However: complex orthogonal designs exist only if N = 2. . . !

Generalized Complex Orthogonal Designs (GCOD)

• Definition: Let $\mathcal{G}_c$ be a p × N matrix with entries in the indeterminates $\pm x_1, \pm x_2, \ldots, \pm x_k$, their conjugates $\pm x_1^*, \pm x_2^*, \ldots, \pm x_k^*$ or multiples of these indeterminates by $\pm\sqrt{-1}$ or 0. If $\mathcal{G}_c^\dagger \mathcal{G}_c = (|x_1|^2 + \cdots + |x_k|^2)\mathbf{I}$, then $\mathcal{G}_c$ is referred to as a generalized complex orthogonal design of size N and rate R = k/p

• Definition: Generalized complex linear processing orthogonal design (GCLPOD) $\mathcal{L}_c$: exactly like above, but the entries can be linear combinations of $x_1, \ldots, x_k$ and their conjugates

• One can obtain a diversity order of MN at rate R using a STBC based on a GCOD or a GCLPOD of size N and rate R

Generalized Complex Orthogonal Designs

• Generalized complex linear processing orthogonal designs of rates:

– R = 1 exist for N = 2

– R = 3/4 exist for N = 3 and N = 4

– R = 1/2 exist for N ≥ 5

• For N ≥ 3, it is not known whether GCLPODs with higher rates exist

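• Example (GCLPOD, R = 3/4, N = 3 and GCOD, R = 1/2, N = 3):

$$\mathcal{L}_c^3 = \begin{pmatrix} x_1 & x_2 & \frac{x_3}{\sqrt{2}} \\ -x_2^* & x_1^* & \frac{x_3}{\sqrt{2}} \\ \frac{x_3^*}{\sqrt{2}} & \frac{x_3^*}{\sqrt{2}} & \frac{-x_1 - x_1^* + x_2 - x_2^*}{2} \\ \frac{x_3^*}{\sqrt{2}} & -\frac{x_3^*}{\sqrt{2}} & \frac{x_2 + x_2^* + x_1 - x_1^*}{2} \end{pmatrix}$$

$$\mathcal{G}_c^3 = \begin{pmatrix} x_1 & x_2 & x_3 \\ -x_2 & x_1 & -x_4 \\ -x_3 & x_4 & x_1 \\ -x_4 & -x_3 & x_2 \\ x_1^* & x_2^* & x_3^* \\ -x_2^* & x_1^* & -x_4^* \\ -x_3^* & x_4^* & x_1^* \\ -x_4^* & -x_3^* & x_2^* \end{pmatrix}$$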
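The orthogonality property behind these designs is easy to test numerically. The sketch below builds the 2 × 2 Alamouti design and the 8 × 3 rate-1/2 design $\mathcal{G}_c^3$ for random complex symbols and checks that $\mathcal{G}^\dagger\mathcal{G}$ is a multiple of the identity. The helper names are illustrative. Note that for the 8 × 3 matrix as written the multiple comes out as $2\sum_i |x_i|^2$ rather than $\sum_i |x_i|^2$, because every symbol appears twice in each column, so the defining identity holds up to a constant scaling.

```python
import numpy as np

def alamouti_design(x):
    """2 x 2 complex orthogonal design O_c(x1, x2)."""
    x1, x2 = x
    return np.array([[x1, x2], [-np.conj(x2), np.conj(x1)]])

def g3c_design(x):
    """8 x 3 rate-1/2 design: four rows in x, then the same rows conjugated."""
    x1, x2, x3, x4 = x
    top = np.array([[x1, x2, x3],
                    [-x2, x1, -x4],
                    [-x3, x4, x1],
                    [-x4, -x3, x2]])
    return np.vstack([top, np.conj(top)])

def gram_scale(G, x):
    """Return kappa if G^dagger G == kappa * sum|x_i|^2 * I, otherwise None."""
    gram = G.conj().T @ G
    energy = np.sum(np.abs(x) ** 2)
    kappa = gram[0, 0].real / energy
    if np.allclose(gram, kappa * energy * np.eye(G.shape[1])):
        return kappa
    return None

rng = np.random.default_rng(0)
x2 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
x4 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
kappa_alamouti = gram_scale(alamouti_design(x2), x2)
kappa_g3c = gram_scale(g3c_design(x4), x4)
```

Orthogonal columns are what allow the ML detector to decouple the symbols and reduce to linear processing.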

Capacity and Space–Time Block Codes

• Space–time block codes

– have extremely low encoder/decoder complexity

– provide full diversity

• However

– For the i.i.d. Rayleigh channel, STBCs result in a capacity loss in the presence of multiple receive antennas

– STBCs are only optimal with respect to capacity when they have rate R = 1 and there is one receive antenna

Maximizing the Throughput with V–BLAST

Maximizing the Throughput with V–BLAST


• Transmitters operate co–channel, symbol synchronized

• Substreams are exactly independent (no coding across the transmit antennas; each substream can be individually coded)

• Individual transmit powers scaled by 1/N so the total power is kept constant

• Channel estimation burst by burst using a training sequence

• Requires near–independent channel coefficients

Receivers for Spatial Multiplexing

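y = Hx + n, i.e.

$$\begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1N} \\ h_{21} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ h_{M1} & \cdots & \cdots & h_{MN} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix} + \begin{bmatrix} n_1 \\ \vdots \\ n_M \end{bmatrix}$$

• If we transmit a block of N × T symbols, we have $\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}$, with $\mathbf{Y}, \mathbf{N} \in \mathbb{C}^{M \times T}$ and $\mathbf{X} \in \mathbb{C}^{N \times T}$

• Optimal (ML) Receiver: $\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \lVert \mathbf{y} - \mathbf{H}\mathbf{x} \rVert$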

– Exhaustive search (often prohibitive complexity)

– Diversity order for each data stream: M (N ≤M)
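A brute-force version of the ML receiver makes the complexity issue concrete: every one of the $|\mathcal{A}|^N$ candidate vectors is tested. The sketch below assumes a QPSK constellation and small illustrative dimensions; `ml_detect` is a hypothetical helper name.

```python
import itertools
import numpy as np

def ml_detect(y, H, constellation):
    """Exhaustive search for the x in constellation^N minimizing ||y - Hx||."""
    N = H.shape[1]
    best_x, best_metric = None, np.inf
    for candidate in itertools.product(constellation, repeat=N):
        x = np.array(candidate)
        metric = np.linalg.norm(y - H @ x)
        if metric < best_metric:
            best_x, best_metric = x, metric
    return best_x

qpsk = [(1 + 1j) / np.sqrt(2), (1 - 1j) / np.sqrt(2),
        (-1 + 1j) / np.sqrt(2), (-1 - 1j) / np.sqrt(2)]

rng = np.random.default_rng(0)
M, N = 4, 3                                   # illustrative sizes
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
x_true = np.array([qpsk[0], qpsk[3], qpsk[1]])
y = H @ x_true                                # noiseless for the check
x_ml = ml_detect(y, H, qpsk)
```

With 4 symbols and N = 3 the search visits 64 candidates; the count grows exponentially in N and in the number of bits per symbol, which motivates the lower-complexity receivers.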

Receivers for Spatial Multiplexing

y = Hx + n

• Zero–forcing (ZF) Receiver:

$$\hat{\mathbf{x}} = \mathbf{H}^\# \mathbf{y}$$

with $\mathbf{H}^\# = (\mathbf{H}^\dagger\mathbf{H})^{-1}\mathbf{H}^\dagger$ (pseudo–inverse)

– Significantly reduced receiver complexity

– Noise enhancement problem

– Diversity order for each data stream: M −N + 1 (N ≤M)

Receivers for Spatial Multiplexing

y = Hx + n

• Minimum mean–square error (MMSE) Receiver:

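$\hat{\mathbf{x}} = \mathbf{W} \cdot \mathbf{y}$, where $\mathbf{W} = \arg\min_{\mathbf{W}} E\left[\lVert \mathbf{W}\mathbf{y} - \mathbf{x} \rVert^2\right]$

We obtain (for uncorrelated, unit–power transmitted symbols):

$$\hat{\mathbf{x}} = \mathbf{H}^\dagger\left(\mathbf{H}\mathbf{H}^\dagger + E[\mathbf{n}\mathbf{n}^\dagger]\right)^{-1} \cdot \mathbf{y}$$

– Minimizes the overall error due to noise and mutual interference

– Equivalent to the zero–forcing receiver at high SNR

– Diversity order for each data stream: approximately M − N + 1 (N ≤ M)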
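Both linear receivers are a single matrix operation. The sketch below implements the zero-forcing estimate with the pseudo-inverse and the MMSE estimate as $\mathbf{H}^\dagger(\mathbf{H}\mathbf{H}^\dagger + \sigma^2\mathbf{I}_M)^{-1}\mathbf{y}$, assuming uncorrelated unit-power symbols and white noise of variance $\sigma^2$ (both assumptions of this example); the function names are illustrative.

```python
import numpy as np

def zf_estimate(y, H):
    """Zero-forcing: x = pinv(H) y, i.e. (H^dagger H)^{-1} H^dagger y for N <= M."""
    return np.linalg.pinv(H) @ y

def mmse_estimate(y, H, sigma2):
    """MMSE for unit-power symbols and noise covariance sigma2 * I_M."""
    M = H.shape[0]
    A = H @ H.conj().T + sigma2 * np.eye(M)
    return H.conj().T @ np.linalg.solve(A, y)

rng = np.random.default_rng(0)
M, N = 4, 3                                   # illustrative sizes
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
x_true = np.array([1 + 1j, -1 + 1j, 1 - 1j]) / np.sqrt(2)
y = H @ x_true                                # noiseless for the check

x_zf = zf_estimate(y, H)
x_mmse_high_snr = mmse_estimate(y, H, sigma2=1e-6)
```

As the noise variance goes to zero the MMSE estimate approaches the ZF estimate, which matches the high-SNR equivalence stated above. Both outputs still have to be quantized to the constellation.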

Receivers for Spatial Multiplexing

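y = Hx + n, $\mathbf{H} = [\mathbf{h}_1 \; \mathbf{h}_2 \; \cdots \; \mathbf{h}_N]$

• V–BLAST receiver with successive interference cancellation (SIC):

$$\tilde{x}_1 = \mathbf{w}_1^T \mathbf{y}$$

$$\hat{x}_1 = Q(\tilde{x}_1) \quad \text{(quantization)}$$

$$\mathbf{y}_2 = \mathbf{y} - \hat{x}_1 \mathbf{h}_1 \quad \text{(interference cancellation)}$$

$$\tilde{x}_2 = \mathbf{w}_2^T \mathbf{y}_2, \text{ etc.}$$

• The ith ZF–nulling vector $\mathbf{w}_i$ is defined as the unique minimum–norm vector satisfying

$$\mathbf{w}_i^T \mathbf{h}_j = \begin{cases} 0 & j > i \\ 1 & j = i \end{cases}$$

It is orthogonal to the subspace spanned by the contributions to $\mathbf{y}_i$ due to the symbols not yet estimated and cancelled, and is given by the ith row of the pseudo–inverse of $\mathbf{H}$ with columns 1, . . . , i − 1 set to zero (for i = 1, the ith row of $\mathbf{H}^\# = (\mathbf{H}^\dagger\mathbf{H})^{-1}\mathbf{H}^\dagger$), N ≤ M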

Receivers for Spatial Multiplexing

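y = Hx + n, $\mathbf{H} = [\mathbf{h}_1 \; \mathbf{h}_2 \; \cdots \; \mathbf{h}_N]$

• V–BLAST receiver

– The SNR of $x_i$ is proportional to $1/\lVert \mathbf{w}_i \rVert^2$

– Idea: detect the components $x_i$ in order of decreasing SNR =⇒ ordered successive interference cancellation (OSIC)

initialization:

$$\mathbf{G}_1 = \mathbf{H}^\#, \qquad i = 1, \qquad \mathbf{y}_1 = \mathbf{y}, \qquad \text{with } \mathbf{G}_i = [\mathbf{g}_i^1 \; \mathbf{g}_i^2 \; \cdots \; \mathbf{g}_i^N]^T$$

recursion:

$$k_i = \arg\min_{j \notin \{k_1, \ldots, k_{i-1}\}} \lVert \mathbf{g}_i^j \rVert^2$$

$$\mathbf{w}_{k_i} = \mathbf{g}_i^{k_i}$$

$$\tilde{x}_{k_i} = \mathbf{w}_{k_i}^T \mathbf{y}_i$$

$$\hat{x}_{k_i} = Q(\tilde{x}_{k_i})$$

$$\mathbf{y}_{i+1} = \mathbf{y}_i - \hat{x}_{k_i} \mathbf{h}_{k_i}$$

$$\mathbf{G}_{i+1} = \mathbf{H}_{\overline{k_i}}^\#, \qquad \mathbf{H}_{\overline{k_i}}: \; \mathbf{H} \text{ with columns } k_1, \ldots, k_i \text{ set to 0}$$

$$i = i + 1$$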
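The recursion translates almost line by line into NumPy. The following is a sketch of ZF-based OSIC under the assumptions N ≤ M, a full-column-rank H and a finite constellation used as the quantizer Q; the function name `vblast_osic` and the QPSK example are illustrative.

```python
import numpy as np

def vblast_osic(y, H, constellation):
    """Ordered successive interference cancellation with ZF nulling."""
    constellation = np.asarray(constellation)
    N = H.shape[1]
    H_work = H.astype(complex)              # copy; detected columns get zeroed
    y_i = y.astype(complex)
    x_hat = np.zeros(N, dtype=complex)
    remaining = list(range(N))
    for _ in range(N):
        G = np.linalg.pinv(H_work)          # rows of zeroed columns are zero
        norms = np.sum(np.abs(G[remaining]) ** 2, axis=1)
        k = remaining[int(np.argmin(norms))]   # smallest norm = highest SNR
        x_tilde = G[k] @ y_i                   # nulling: w_k^T y_i
        x_hat[k] = constellation[np.argmin(np.abs(constellation - x_tilde))]
        y_i = y_i - x_hat[k] * H[:, k]         # cancel the detected symbol
        H_work[:, k] = 0                       # deflate the channel matrix
        remaining.remove(k)
    return x_hat

qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
rng = np.random.default_rng(0)
M, N = 4, 3                                  # illustrative sizes
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
x_true = qpsk[[0, 3, 1]]
y = H @ x_true                               # noiseless for the check
x_osic = vblast_osic(y, H, qpsk)
```

Recomputing the pseudo-inverse at every stage keeps the code close to the recursion; practical implementations usually update it incrementally instead.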

Receivers for Spatial Multiplexing

• The V–BLAST SIC receiver:

– Provides a reasonable trade–off between complexity and performance (between MMSE and ML receivers)

– Achieves a diversity order of approximately M − N + 1 per data stream (N ≤ M)

• The V–BLAST OSIC receiver:

– Provides a reasonable trade–off between complexity and performance (between MMSE and ML receivers)

– Achieves a diversity order which lies between M − N + 1 and M for each data stream (N ≤ M)

Performance Comparison

[Figure: performance comparison of the receivers; one curve is annotated "diversity receiver"]




Performance Comparison

Linear Dispersion Codes


• V–BLAST

– is unable to work with fewer receive than transmit antennas

– doesn’t have any built–in spatial coding

• Space–time codes do not perform well at high data rates

• Linear dispersion codes

– include V–BLAST and the orthogonal design STBCs as special cases

– can be used for any number of transmit and receive antennas

– can be decoded with V–BLAST like algorithms

– satisfy an information–theoretic optimality criterion

Linear Dispersion Codes

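• A linear dispersion code of rate $R = \frac{k}{p}\, b$ is one for which

$$\mathbf{X} = \sum_{i=1}^{k} (c_i \mathbf{C}_i + c_i^* \mathbf{D}_i), \qquad \mathbf{X} \in \mathbb{C}^{p \times N}$$

where $c_1, \ldots, c_k$ belong to a constellation $\mathcal{A}$ with $2^b$ symbols and $\mathbf{C}_i, \mathbf{D}_i \in \mathbb{C}^{p \times N}$

Number of transmit antennas: N

Number of receive antennas: M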


Linear Dispersion Codes

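• If $\mathbf{Y} = \mathbf{X}\mathbf{H}^T + \mathbf{N}$ with $\mathbf{H} \in \mathbb{C}^{M \times N}$ and $\mathbf{Y}, \mathbf{N} \in \mathbb{C}^{p \times M}$, it can be shown that

$$\underbrace{\begin{bmatrix} \hat{\mathbf{y}}_1 \\ \vdots \\ \hat{\mathbf{y}}_M \end{bmatrix}}_{\boldsymbol{\eta}} = \mathcal{H} \underbrace{\begin{bmatrix} \hat{\mathbf{c}}_1 \\ \vdots \\ \hat{\mathbf{c}}_k \end{bmatrix}}_{\boldsymbol{\xi}} + \begin{bmatrix} \hat{\mathbf{n}}_1 \\ \vdots \\ \hat{\mathbf{n}}_M \end{bmatrix},$$

where $\mathbf{Y} = [\mathbf{y}_1 \cdots \mathbf{y}_M]$, $\mathbf{N} = [\mathbf{n}_1 \cdots \mathbf{n}_M]$,

$$\hat{\mathbf{y}}_i \triangleq \begin{bmatrix} \mathrm{Re}(\mathbf{y}_i) \\ \mathrm{Im}(\mathbf{y}_i) \end{bmatrix}, \qquad \hat{\mathbf{n}}_i \triangleq \begin{bmatrix} \mathrm{Re}(\mathbf{n}_i) \\ \mathrm{Im}(\mathbf{n}_i) \end{bmatrix}, \qquad \hat{\mathbf{c}}_i \triangleq \begin{bmatrix} \mathrm{Re}(c_i) \\ \mathrm{Im}(c_i) \end{bmatrix},$$

and $\mathcal{H} \in \mathbb{C}^{2Mp \times 2k} = f(\mathbf{H}, \mathbf{C}_1, \ldots, \mathbf{C}_k, \mathbf{D}_1, \ldots, \mathbf{D}_k)$

• V–BLAST like techniques can thus be used to decode linear dispersion codes

• $\{\mathbf{C}_1, \ldots, \mathbf{C}_k, \mathbf{D}_1, \ldots, \mathbf{D}_k\}$ are dispersion matrices designed to optimize given criteria (e.g. maximum mutual information between $\boldsymbol{\eta}$ and $\boldsymbol{\xi}$)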

Diversity vs. Multiplexing Trade–off

C = min{N,M} log SNR +O(1)

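• Definition: A scheme {C(SNR)} is a family of codes of block length $l$, one for each SNR level. R(SNR) [b/symbol] denotes the rate of the code C(SNR)

• Definition: A scheme {C(SNR)} is said to achieve spatial multiplexing gain $r$ and diversity gain $d$ if the data rate satisfies

$$\lim_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} = r$$

and the average error probability satisfies

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log P_e(\mathrm{SNR})}{\log \mathrm{SNR}} = -d \qquad (2)$$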

Diversity vs. Multiplexing Trade–off

• For each $r$, $d^*(r)$ is the supremum of the diversity gains achieved over all schemes

• We also define:

– $d^*_{\max} \triangleq d^*(0)$, the maximal diversity gain

– $r^*_{\max} \triangleq \sup\{r \mid d^*(r) > 0\}$, the maximal spatial multiplexing gain

• Theorem: Assume $l \ge N + M - 1$. The optimal trade–off curve $d^*(r)$ is given by the piecewise–linear function connecting the points $(k, d^*(k))$, $k = 0, 1, \ldots, \min\{N, M\}$, where

$$d^*(k) = (N - k)(M - k).$$

In particular, $d^*_{\max} = NM$ and $r^*_{\max} = \min\{N, M\}$.
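The optimal trade-off curve of the theorem is straightforward to evaluate by linear interpolation between its corner points. A minimal sketch, with `optimal_tradeoff` as an illustrative function name:

```python
import numpy as np

def optimal_tradeoff(r, N, M):
    """d*(r): piecewise-linear interpolation of (k, (N-k)(M-k)), k = 0..min(N,M)."""
    k = np.arange(min(N, M) + 1)
    return np.interp(r, k, (N - k) * (M - k))

# Example: a 2 x 2 system
d_at_0 = optimal_tradeoff(0.0, 2, 2)     # maximal diversity gain N*M
d_at_half = optimal_tradeoff(0.5, 2, 2)  # halfway between the points (0,4) and (1,1)
d_at_2 = optimal_tradeoff(2.0, 2, 2)     # no diversity left at r = min(N, M)
```

The function is only meaningful for 0 ≤ r ≤ min(N, M); `np.interp` clamps values outside that interval to the end points.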

Diversity vs. Multiplexing: Optimal Trade–off

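[Figure: optimal diversity vs. multiplexing trade–off curve; in the figure, $m \triangleq N$ and $n \triangleq M$]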

Diversity vs. Multiplexing Trade–off: V–BLAST

[Figure: diversity vs. multiplexing trade–off of V–BLAST; in the figure, $n \triangleq N = M$]

Diversity vs. Multiplexing Trade–off: Alamouti Scheme

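[Figure: diversity vs. multiplexing trade–off of the Alamouti scheme; in the figure, $m \triangleq N$ and $n \triangleq M$]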

Diversity vs. Multiplexing Trade–off: Alamouti Scheme

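[Figure: diversity vs. multiplexing trade–off of the Alamouti scheme (continued); in the figure, $m \triangleq N$ and $n \triangleq M$]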

Diversity vs. Multiplexing Trade–off

• Definitions (1) and (2) for the diversity gain are not equivalent: in the former, a fixed data rate is assumed for all SNRs, whereas in the latter, the data rate is a fraction of C(SNR), and hence increases with the SNR

• Definition (1) is the most widely used in the literature

• Definition (2) makes it possible to quantify the diversity vs. multiplexing trade–off

MIMO Channel Modeling

• A good MIMO channel model must include:

– Path loss

– Shadowing

– Doppler and delay spread profiles

– Ricean K factor distribution

– Joint antenna correlation at transmit and receive ends

– Channel matrix singular value distribution

Ricean K factor distribution


• The higher the Ricean K factor, the more dominant HLOS


• $\mathbf{H}_{LOS}$ is a time–invariant, often low–rank matrix =⇒ high K factor channels often exhibit a low capacity

• In a near–LOS link, the improvement in link budget often more than compensates for the loss of MIMO capacity =⇒ usually, the LOS component is not intentionally reduced

• Experimental measurements show that, in general:

– K increases with antenna height

– K decreases with transmitter–receiver distance =⇒ MIMO substantially increases throughput in areas far away from the base station

Correlation Model for HNLOS

“One–ring” model

• Base Station (BS) usually elevated and unobstructed by local scatterers

• Subscriber Unit (SU) often surrounded by local scatterers, assumed here to be uniformly distributed in θ

$TA_l$: lth transmitting antenna element

$RA_l$: lth receiving antenna element

$S(\theta)$: scatterer located at angle θ

Θ: angle of arrival

∆: angle spread


Correlation Model for HNLOS

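• Correlation from one BS antenna element to two SU antenna elements:

$$E[H_{l,p} H_{m,p}^*] \approx J_0\left(\frac{2\pi}{\lambda}\, d(l, m)\right)$$

• Correlation from two BS antenna elements to one SU antenna element in the broadside direction (Θ = 0):

$$E[H_{m,p} H_{m,q}^*] \approx J_0\left(\Delta\, \frac{2\pi}{\lambda}\, d(p, q)\right)$$

• Correlation from two BS antenna elements to one SU antenna element in the inline direction (Θ = π/2):

$$E[H_{m,p} H_{m,q}^*] \approx e^{-j\frac{2\pi}{\lambda} d(p,q)\left(1 - \frac{\Delta^2}{4}\right)} \cdot J_0\left(\left(\frac{\Delta}{2}\right)^2 \frac{2\pi}{\lambda}\, d(p, q)\right)$$

Here $d(l, m)$ is the distance between antennas l and m, and $d(p, q)$ is the distance between antennas p and q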

Correlation Model for HNLOS

[Figure: plot of the Bessel function $J_0(x)$]

• The mobiles have to be in the broadside direction to obtain the highest diversity

• Interelement spacing has to be high to have low correlation =⇒ beamforming and MIMO yield conflicting criteria

• Using the above results, one can obtain upper bounds for the MIMO capacity

Decoupling Between Rank and Correlation

Pinhole channel

• Uncorrelated fading at both ends doesn’t necessarily imply a high–rank channel

MIMO Channel Modeling

• Time–varying wideband MIMO channel:

$$\mathbf{H}(\tau) = \sum_{i=1}^{L} \mathbf{H}_i\, \delta(\tau - \tau_i)$$

where $\mathbf{H}(\tau) \in \mathbb{C}^{M \times N}$ and only $\mathbf{H}_1$ contains a LOS component

• Typical interelement spacing:

– Base station: 10λ (due to the absence of local scatterers)

– Subscriber unit: λ/2 (rich scattering)

[Figure: block diagrams of a SISO OFDM transmitter and a SISO OFDM receiver; in the figure, $N \triangleq K$ and $l$ is the OFDM symbol number]

• Net result: The frequency–selective fading channel of bandwidth B is decomposed into K parallel frequency–flat fading channels, each having bandwidth B/K. (Condition: The impulse response of the channel is shorter than the length of the cyclic prefix)

• OFDM can be extended to MIMO systems by performing the IDFT/DFT and CP operations at each of the transmit and receive antennas (with the appropriate condition on the length of the cyclic prefix)

• Diversity systems: (Ex: Alamouti scheme)

– Send c1 and c2 over OFDM tone i over antennas 1 and 2

– Send $-c_2^*$ and $c_1^*$ over OFDM tone i + 1 over antennas 1 and 2 within the same OFDM symbol

– Alternative technique: Code on a per–tone basis across OFDM symbols in time

• Spatial multiplexing: Maximize spatial rate ($r = \min\{N, M\}$) by transmitting independent data streams over different antennas =⇒ spatial multiplexing over each tone

• Space–frequency coded MIMO–OFDM

– OFDM tones with spacing larger than the coherence bandwidth $B_C$ experience independent fading

– If $D_{eff} = B / B_C$, the total diversity gain that can be realized is $N M D_{eff}$
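The decomposition of a frequency-selective channel into K parallel flat channels can be verified for the SISO case with an FFT. The sketch below assumes K = 64 tones, a cyclic prefix of 8 samples and a 4-tap channel (all illustrative values), with no noise, and checks that each received tone equals the transmitted tone multiplied by one complex channel coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
K, cp_len, L = 64, 8, 4          # tones, cyclic prefix length, channel taps

# Channel impulse response, shorter than the cyclic prefix
h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2 * L)

# One OFDM symbol of QPSK data on K tones
bits = rng.integers(0, 2, size=(K, 2))
X = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

# Transmitter: IDFT, then prepend the cyclic prefix
x = np.fft.ifft(X)
tx = np.concatenate([x[-cp_len:], x])

# Channel: linear convolution; receiver: drop the prefix, then DFT
rx = np.convolve(tx, h)[cp_len:cp_len + K]
Y = np.fft.fft(rx)

# Per-tone flat-fading coefficients
H_freq = np.fft.fft(h, K)
X_equalized = Y / H_freq         # one-tap equalizer per tone
```

Because the prefix turns the linear convolution into a circular one, `Y` equals `H_freq * X` tone by tone. For MIMO–OFDM the same holds per tone with `H_freq[i]` replaced by an M × N matrix.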

Throughput in MIMO Cellular Systems

• MIMO channels offer multiplexing gain, diversity gain, array gain and a co–channel interference cancellation gain

• Careful balancing between those gains is required

• MIMO systems offer a promising solution for future generation wireless networks

• Ongoing research

– Space–time coding (orthogonal designs, etc.)

– Receiver design (ML receiver is too complex)

– Channel modeling

– Capacity of non–ideal MIMO channels

– . . .

