Digital Signal Processing · 2018-03-26 · processing

Digital Signal Processing 41 (2015) 60–69


A geometry based efficient decoder for underdetermined MIMO systems

Chung-Jung Huang, Chang-Shen Lee, Wei-Ho Chung ∗, Ta-Sung Lee

Article info

Article history: Available online 25 March 2015

Keywords: Maximum-likelihood (ML) detection; Sphere decoding; Tree search; Ordering

Abstract

In this paper, a geometry based decoder with low decoding complexity and exact maximum-likelihood (ML) performance is proposed for underdetermined multiple-input multiple-output (MIMO) systems. In the proposed decoder, an underdetermined MIMO system is divided into a multiple-input single-output (MISO) sub-system and a regular MIMO sub-system in which the numbers of transmit and receive antennas are equal. An efficient slab search algorithm (ESSA) is adopted to obtain valid candidate points in the MISO sub-system. By adopting ESSA in the MISO sub-system and the sphere decoding algorithm (SDA) in the MIMO sub-system, the ML solution of the underdetermined MIMO system can be obtained with low computational complexity. To further reduce the computational complexity, a near-ML SDA is proposed to find the candidate points in the MIMO sub-system more efficiently. In addition, an optimal preprocessing technique is proposed from the geometrical perspective, and a comprehensive analysis of the complexity reduction is provided. Simulation results indicate that the proposed approach significantly reduces the complexity compared to existing ML decoders, particularly for systems with a large number of antennas and/or high-order constellations.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

With the increasing interest in multiple-input multiple-output (MIMO) systems, MIMO detection has become crucial. MIMO detection can be categorized into single-user detection and cooperative detection. In [1], the authors studied transmissions in broadcasting with heterogeneous users, and proposed several cooperation schemes among the users to achieve high transmission rates. For single-user detection, the well-known detection schemes include linear detection, successive interference cancellation (SIC) [2], and maximum-likelihood (ML) detection. Linear detection and SIC are low-complexity sub-optimal detectors; in contrast, the ordinary ML detector is the optimal detector, with complexity growing exponentially with the number of transmit antennas. To remedy this, the sphere decoding algorithm (SDA) has been investigated to achieve ML performance with reduced complexity [3,4]. The complexity of SDA can be further reduced through proper preprocessing schemes [5]. In [5], the authors proposed several criteria to determine the decoding order of each layer, which reduces the number of candidate points and thus the computational complexity of the SDA. However, SDA fails in underdetermined MIMO systems, because the upper-triangular matrix generated from the Cholesky or QR decomposition of the rank-deficient channel matrix has zeros as diagonal elements. Underdetermined MIMO systems arise in certain settings, e.g., MIMO systems in which the number of receive antennas is smaller than the number of transmit antennas. This configuration can be found in the downlink transmission of Third Generation Partnership Project (3GPP) Long Term Evolution (LTE).

* Corresponding author. E-mail addresses: [email protected] (C.-J. Huang), [email protected] (C.-S. Lee), [email protected] (W.-H. Chung), [email protected] (T.-S. Lee).

http://dx.doi.org/10.1016/j.dsp.2015.03.005
1051-2004/© 2015 Elsevier Inc. All rights reserved.

To overcome the aforementioned drawback of SDA, several decoders have been proposed. First, the generalized sphere decoder (GSD) [6] performs an exhaustive search on specified dimensions to find the ML solution. Its decoding complexity increases with the constellation size and the difference between the numbers of transmit and receive antennas. Based on GSD, other efficient decoders have been proposed, such as the regularized sphere decoder [7,8], the tree-search approach [9,10], the double-layer sphere decoder (DLSD) [11,12], and the slab sphere decoder (SSD) [13,14]. In [7], the authors use regularization techniques to make the channel matrix full-rank, and then apply the conventional SDA to obtain an ML solution. Later, the work in [8] reformulates this approach and improves the complexity for generalized Q-quadrature amplitude modulation (QAM) systems. In [9], the authors propose an efficient tree-search decoding algorithm for binary constellation systems, and in [10] they extend this algorithm to phase shift keying (PSK) systems with arbitrary constellation size. In this modified algorithm, constellations of Q-QAM systems need to be decomposed into a weighted sum of QPSK constellations. As a result, the decoding complexity increases rapidly with the transmit–receive antenna number difference and/or the constellation size. The DLSD in [11] utilizes an outer sphere decoder to find a valid candidate set, and an inner sphere decoder to find the ML solution. The SSD in [14] adopts a geometrical approach for finding the valid candidate set to further reduce the search complexity. Comparing the results in [6–14], the SSD in [14] is the ML decoder which exhibits the lowest complexity for large constellations, and is thus chosen as a benchmark. Unfortunately, as shown in [14], its decoding complexity still increases rapidly with the transmit–receive antenna number difference and/or the constellation size.

On the other hand, there are several algorithms featuring fixed complexity or low complexity for underdetermined MIMO systems, such as the robust fixed-complexity SDA [15], algorithms for coded MIMO systems [16,17], a hybrid approach [18] and a heuristic search method [19]. Although [15–19] can achieve lower complexity than [6–14] in certain scenarios, they cannot guarantee ML performance. Therefore, developing efficient ML decoding algorithms is still an active research field for practical applications.

Preprocessing at the receiving end is another technique to reduce decoding complexity for all decoders and/or improve decoding performance for sub-optimal decoders. Common preprocessing techniques include column permutation [12], scaling [20] and lattice reduction [3,21]. Generally, preprocessing techniques can be applied to any decoder to enhance its performance and reduce its complexity; however, preprocessing itself also incurs computational cost. Therefore, proper preprocessing for a specific decoder requires a careful design.

In [14], the underdetermined MIMO detection problem is converted into a multiple-input single-output (MISO) detection problem and a symmetric (i.e., equal number of transmit and receive antennas) MIMO detection problem. For example, a (5,4) underdetermined MIMO detection problem can be viewed as a (2,1) MISO and a (3,3) symmetric MIMO detection problem. For the MISO problem, the authors propose an efficient search method to find the candidate points from the geometrical perspective. The candidate points obtained from the MISO problem are then set as the candidate points in the symmetric MIMO problem, and SDA can be performed to find the ML solution. In summary, the reduced complexity in [14] compared to existing approaches is achieved by the efficient search method in the MISO problem.

Motivated by [14], this paper proposes a geometry-based efficient decoder for underdetermined MIMO systems. An efficient slab search algorithm (ESSA) [22] is adopted to obtain valid candidate points within a given slab. The proposed decoder can thus provide ML performance with lower complexity compared to SSD. To further reduce the complexity, a near-ML SDA is proposed to find the candidate points more efficiently than the traditional SDA. In addition, we propose an optimal preprocessing technique from the geometrical perspective and conduct a comprehensive analysis of the complexity reduction. With the proposed preprocessing scheme, the decoder can significantly reduce the decoding complexity in the low SNR regime without sacrificing performance.

The remainder of the paper is organized as follows. Section 2 describes the signal model and the SSD algorithm. Section 3 introduces the proposed decoding algorithms. The proposed preprocessing techniques and the analysis of complexity reduction are presented in Section 4. Section 5 presents simulation results that demonstrate the advantages and confirm the analytical results of the preprocessing scheme. Finally, Section 6 concludes the paper.

Throughout the paper, we denote the sets of real, complex, and integer numbers by $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{Z}$, respectively. Vectors and matrices are denoted by lower-case and upper-case boldface letters, respectively, with $\mathbf{I}_N$ representing the $N \times N$ identity matrix. $[\cdot]^H$ denotes the conjugate transpose operation.

2. Signal model and underdetermined SDA

Consider a symbol-synchronized MIMO system with $N_t$ transmit antennas and $N_r$ receive antennas. The received and transmitted signal vectors are denoted as $\mathbf{y} = [y_1\ y_2\ \cdots\ y_{N_r}]^T \in \mathbb{C}^{N_r \times 1}$ and $\mathbf{x} = [x_1\ x_2\ \cdots\ x_{N_t}]^T \in \mathbb{Z}^{N_t}[j]$, respectively. In the above expressions, $y_m$ and $x_n$ are the received signal at the $m$th receive antenna and the transmitted signal at the $n$th transmit antenna, respectively. Quadrature amplitude modulation with $Q$ constellation points ($Q$-QAM) is adopted in our system. Let $\mathbf{H}$ denote the $N_r \times N_t$ channel matrix whose entry $h_{i,j}$ is the channel gain from the $j$th transmit antenna to the $i$th receive antenna. Assume that the channel is frequency-flat fading and remains constant over a frame duration, and $h_{i,j} \sim \mathcal{CN}(0, 1)$. The relationship between the received signal and the transmitted signal can be expressed as

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{1}$$

where $\mathbf{n} = [n_1\ n_2\ \cdots\ n_{N_r}]^T \in \mathbb{C}^{N_r \times 1}$ is the noise vector with $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}_{N_r})$.

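A commonly practiced alternative to complex MIMO detection is to perform real-value decomposition (RVD) on the complex signal model (the real-valued quantities on the left-hand side reuse the symbols of the complex model):

$$\mathbf{y} = \begin{bmatrix} \mathrm{Re}(\mathbf{y}) \\ \mathrm{Im}(\mathbf{y}) \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} \mathrm{Re}(\mathbf{x}) \\ \mathrm{Im}(\mathbf{x}) \end{bmatrix}, \quad \mathbf{n} = \begin{bmatrix} \mathrm{Re}(\mathbf{n}) \\ \mathrm{Im}(\mathbf{n}) \end{bmatrix}, \quad \mathbf{H} = \begin{bmatrix} \mathrm{Re}(\mathbf{H}) & -\mathrm{Im}(\mathbf{H}) \\ \mathrm{Im}(\mathbf{H}) & \mathrm{Re}(\mathbf{H}) \end{bmatrix}, \tag{2}$$

which yields

$$\mathbf{y} = \sum_{i=1}^{N} \mathbf{h}_i x_i + \mathbf{n} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{3}$$

where $\mathbf{h}_i$ is the $i$th column vector of the channel matrix $\mathbf{H}$, $\mathbf{x} \in \Lambda^{2N_t \times 1} \subset \mathbb{Z}^{N \times 1}$, $\mathbf{y} \in \mathbb{R}^{M \times 1}$, $\mathbf{n} \in \mathbb{R}^{M \times 1}$, $\mathbf{H} \in \mathbb{R}^{M \times N}$, $N = 2N_t$ and $M = 2N_r$. Note that $\Lambda = \{\pm 1, \cdots, \pm(\sqrt{Q} - 1)\}$ for $Q$-QAM systems.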
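As an illustration, the decomposition in (2) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names `real_value_decomposition` and `matvec`, as well as the numerical channel and symbol values, are assumptions introduced only for this example, and the noise term is omitted so that the identity in (3) can be checked exactly.

```python
def real_value_decomposition(H, y):
    """Stack real and imaginary parts as in (2); returns (H_real, y_real)."""
    n_r, n_t = len(H), len(H[0])
    # Upper block row: [Re(H), -Im(H)]
    top = [[H[i][j].real for j in range(n_t)] +
           [-H[i][j].imag for j in range(n_t)] for i in range(n_r)]
    # Lower block row: [Im(H), Re(H)]
    bottom = [[H[i][j].imag for j in range(n_t)] +
              [H[i][j].real for j in range(n_t)] for i in range(n_r)]
    y_real = [v.real for v in y] + [v.imag for v in y]
    return top + bottom, y_real


def matvec(A, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


# Hypothetical 1 x 3 complex channel and 4-QAM symbols (noise-free).
H = [[1 + 2j, 0.5 - 1j, -1j]]
x = [1 - 1j, -1 + 1j, 1 + 1j]
y = matvec(H, x)

H_r, y_r = real_value_decomposition(H, y)
x_r = [v.real for v in x] + [v.imag for v in x]
```

The real-valued model has $M = 2N_r = 2$ rows and $N = 2N_t = 6$ columns here, and `matvec(H_r, x_r)` reproduces `y_r`, which is the noise-free form of (3).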

Remarks

(i) After the RVD operation, the received signal at the $i$th receive antenna, $y_i$, can be expressed as $y_i = h_{i1}x_1 + h_{i2}x_2 + \cdots + h_{iN}x_N + n_i$. The corresponding probability density function (pdf) can be approximated as $\mathcal{N}(0, \sigma_n^2 + N E_s/4)$.

(ii) The set $\{\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_N\}$ is still an i.i.d. Gaussian random vector set, i.e., $\mathbf{h}_i \sim \mathcal{N}(\mathbf{0}, 0.5\,\mathbf{I}_M)$.

(iii) The received signal vector $\mathbf{y}$, the noise vector $\mathbf{n}$ and $\{\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_N\}$ are mutually independent.

The ML detector searches all possible combinations of transmitted symbols via the following criterion [4]:

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in S} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2, \tag{4}$$

where $S$ denotes the set of all possible transmitted symbol vectors, of size $(\sqrt{Q})^N$. The computational complexity of ML detection grows exponentially with $N$. Therefore, it is difficult to implement at the receiver in practice.
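To make the exponential cost of (4) concrete, the following sketch enumerates all $(\sqrt{Q})^N$ real-valued symbol vectors and keeps the one with the smallest squared Euclidean distance. The function name `ml_detect` and the small 2 x 3 real channel are assumptions made for illustration; the example is noise-free, so the transmitted vector is recovered with zero distance.

```python
from itertools import product


def ml_detect(H, y, alphabet):
    """Exhaustive ML search of (4) over alphabet^N (real-valued model)."""
    n = len(H[0])
    best_x, best_d = None, float("inf")
    for x in product(alphabet, repeat=n):
        # Squared Euclidean distance ||y - Hx||^2 for this hypothesis.
        d = sum((yi - sum(h * xj for h, xj in zip(row, x))) ** 2
                for row, yi in zip(H, y))
        if d < best_d:
            best_x, best_d = x, d
    return list(best_x), best_d


# Hypothetical underdetermined real channel with M = 2, N = 3.
H = [[1.0, 0.5, -0.3], [0.2, -1.0, 0.8]]
x_true = [1, -1, 1]
y = [sum(h * xj for h, xj in zip(row, x_true)) for row in H]  # no noise

x_hat, dist = ml_detect(H, y, alphabet=(-1, 1))
```

With the alphabet $\{-1, +1\}$ and $N = 3$ the loop visits only $2^3 = 8$ hypotheses, but the count multiplies by the alphabet size for every additional real dimension, which is the growth that sphere decoding is meant to avoid.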

For MIMO systems with $M \geq N$, SDA has been proposed to achieve ML performance with lower complexity [4]. The basic idea of SDA is to restrict the search region to the interior of a hyper-sphere of radius $C$ centered on the received signal vector $\mathbf{y}$:

$$\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 \leq C^2. \tag{5}$$

For underdetermined MIMO systems with $M < N$, the SSD algorithm proposed in [14] performs an efficient search for (5). The SSD algorithm first performs the QR decomposition of $\mathbf{H}$, leading to

$$\|\mathbf{y}' - \mathbf{R}\mathbf{x}\|^2 \leq C^2, \tag{6}$$

where $\mathbf{y}' = \mathbf{Q}^T \mathbf{y}$. With $M < N$, we have

$$-C \leq y'_M - (r_{M,M} x_M + \cdots + r_{M,N} x_N) \leq C \tag{7}$$

at the $M$th layer. In (7), a detection in an $(N - M + 1)$-dimensional subspace is involved, which is similar to a real-valued MISO problem [14]. The SSD algorithm then employs a 2-stage decoder, consisting of the plane decoding algorithm (PDA) and the slab decoding algorithm (SLA), to obtain the constellation points (defined by $\mathbf{H}\mathbf{x}$) falling inside the slab described by (7) via $(N - M + 1)$ one-dimensional searches. These points form the candidate point set, and each point in the set can be substituted into (6) to obtain

$$\|\mathbf{y}_G - \mathbf{R}_1 \mathbf{x}_G\|^2 \leq C^2 - \left[ y'_M - \sum_{j=M}^{N} r_{M,j} x_j \right]^2, \tag{8}$$

where $\mathbf{x}_G = [x_1, x_2, \cdots, x_{M-1}] \in \mathbb{Z}^{M-1}$, $\mathbf{y}_G = [y_1, y_2, \cdots, y_{M-1}] \in \mathbb{R}^{M-1}$, and $\mathbf{R}_1 \in \mathbb{R}^{(M-1) \times (M-1)}$ consists of the first $(M - 1)$ columns and rows of $\mathbf{R}$. Since $\mathbf{R}_1$ is a full-rank upper triangular matrix, SDA can be adopted to find the ML solution for each given candidate point. Finally, the candidate point yielding the smallest Euclidean distance in (6) is chosen as the solution.

Although the SSD algorithm achieves lower complexity than existing decoders, it has certain disadvantages. First, PDA and SLA are independently and sequentially executed, leading to multiple evaluated 1-D searches. In addition, the execution of SDA incurs a high computational load when the number of candidate points and/or $M$ is large.

3. Proposed decoding algorithms for underdetermined MIMO systems

In this section, a new ML decoder is developed which can achieve lower complexity than existing ML decoding algorithms for underdetermined MIMO systems. The decoder consists of two geometry based algorithms for finding the candidate points and performing decoding, respectively.

3.1. An efficient slab search (ESS) algorithm

The proposed search method is similar in principle to PDA and SLA, but requires the execution of only a single algorithm, such that multiple 1-D searches or candidate point searches can be avoided.

From (7), two boundary equations can be formulated. Along each 1-D search, two intersection points (one for the upper bound and the other for the lower bound) which satisfy the boundary equations can be obtained. The candidate points are then those points falling inside the slab. Consider a generalized slab described by the equation $\left| \sum_{i=1}^{K} w_i x_i - y \right| \leq C$. The proposed efficient slab searching (ESS) algorithm is summarized as a flow chart in Fig. 1.
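The idea of intersecting a 1-D search line with the two boundary hyperplanes of the generalized slab can be sketched as follows. This is a simplified illustration and not the ESS flow chart of Fig. 1: it enumerates the first $K-1$ coordinates exhaustively and uses the two boundary equations only for the last coordinate. The function name `slab_candidates` and the numerical weights are assumptions made for the example.

```python
from itertools import product


def slab_candidates(w, y, C, alphabet):
    """Return all x in alphabet^K with |sum_i w[i]*x[i] - y| <= C."""
    *w_head, w_last = w
    if w_last == 0:
        raise ValueError("last weight must be non-zero for the 1-D search")
    found = []
    for head in product(alphabet, repeat=len(w_head)):
        partial = sum(wi * xi for wi, xi in zip(w_head, head))
        # Intersections of the 1-D search line with the two boundaries
        # sum_i w_i x_i = y - C and sum_i w_i x_i = y + C.
        a = (y - C - partial) / w_last
        b = (y + C - partial) / w_last
        lo, hi = min(a, b), max(a, b)
        # Every alphabet point between the intersections lies in the slab.
        found.extend(head + (s,) for s in alphabet if lo <= s <= hi)
    return found


# Hypothetical slab in K = 3 dimensions with a 4-level real alphabet.
weights, centre, radius = [0.9, -0.4, 0.7], 0.5, 0.6
alphabet = (-3, -1, 1, 3)
cands = slab_candidates(weights, centre, radius, alphabet)
```

Each returned tuple satisfies the slab inequality, so the list plays the role of the candidate set that is later passed to the sphere decoder.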

3.2. Sphere decoding algorithm (SDA)

After performing the ESS algorithm, a candidate set satisfying (7) can be efficiently obtained. Hence, (5) can be rewritten as

$$\sum_{i=1}^{M} \left[ y'_i - \sum_{j=i}^{N} r_{ij} x_j \right]^2 \leq C^2. \tag{9}$$

The associated Euclidean distance of each candidate point obtained from layer $i$ is denoted as $\mathrm{dist}_i$. From (9), the candidates $x_{M-1}, \ldots, x_1$ can be obtained sequentially, each from the previous layer. Specifically, we can derive the upper bound $x_{M-1,\mathrm{up}}$, the lower bound $x_{M-1,\mathrm{low}}$, and the distance $\mathrm{dist}_{M-1}$ of layer $(M - 1)$ as follows:

$$x_{M-1,\mathrm{up}} = \frac{\sqrt{C^2 - (\mathrm{dist}_M)^2} + \left( y'_{M-1} - \sum_{j=M}^{N} r_{M-1,j} x_j \right)}{r_{M-1,M-1}},$$

$$x_{M-1,\mathrm{low}} = \frac{-\sqrt{C^2 - (\mathrm{dist}_M)^2} + \left( y'_{M-1} - \sum_{j=M}^{N} r_{M-1,j} x_j \right)}{r_{M-1,M-1}},$$

$$\mathrm{dist}_{M-1}^2 = \left( y'_{M-1} - \sum_{j=M-1}^{N} r_{M-1,j} x_j \right)^2 + (\mathrm{dist}_M)^2.$$

Note that the two bounds define the region in which $x_{M-1}$ lies. A similar procedure can be applied to find the candidate points in the remaining layers.
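The per-layer bounds and the distance update can be sketched as two small helper functions. The names `layer_bounds` and `layer_distance_sq`, the 0-based indexing, and the numerical values are assumptions for this illustration only; the sketch also orders the two bounds so that it remains valid when the diagonal entry of $\mathbf{R}$ is negative.

```python
import math


def layer_bounds(C, dist_prev, y_i, r_row, x, i):
    """Admissible interval for x[i], given decided symbols x[i+1:].

    r_row is row i of R (0-based), dist_prev is the accumulated distance
    of the layers already decided. Returns None if the sphere is empty.
    """
    resid = y_i - sum(r_row[j] * x[j] for j in range(i + 1, len(x)))
    slack = C * C - dist_prev ** 2
    if slack < 0:
        return None
    half = math.sqrt(slack)
    a = (resid - half) / r_row[i]
    b = (resid + half) / r_row[i]
    return min(a, b), max(a, b)


def layer_distance_sq(dist_prev, y_i, r_row, x, i):
    """Squared accumulated distance after fixing x[i]."""
    resid = y_i - sum(r_row[j] * x[j] for j in range(i, len(x)))
    return resid ** 2 + dist_prev ** 2


# Hypothetical layer: row of R, rotated observation, decided symbols.
r_row = [2.0, 1.0, -0.5]
x = [None, 1, -1]            # x[0] is still undecided
low, high = layer_bounds(C=2.0, dist_prev=1.0, y_i=3.0, r_row=r_row, x=x, i=0)

x[0] = 1                     # pick an alphabet point inside [low, high]
d2 = layer_distance_sq(dist_prev=1.0, y_i=3.0, r_row=r_row, x=x, i=0)
```

In this example the interval is roughly $[-0.12, 1.62]$, so the symbol $+1$ is admissible while $-1$ is pruned, and the updated squared distance stays below $C^2 = 4$.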

3.3. Low-complexity SIC-based radius shrinking SDA (SIC-RSSDA)

To further reduce the decoding complexity, a SIC-based radius shrinking SDA (SIC-RSSDA) is proposed. The minimal Euclidean distance associated with the zero forcing-SIC (ZF-SIC) solution among the updated candidate set is used as the new radius. The procedure is executed successively down to layer 1 to obtain the candidate points. Finally, the point with the smallest Euclidean distance $\mathrm{dist}_1$ is the solution. The detailed procedure of SIC-RSSDA is listed as follows:

SIC-RSSDA Algorithm

Initialization:

• Define a working variable state which records the internal state of the radius shrinking. Initialize state with 'RADIUS_INC'. Set $i = M$.

Step 1: Perform SDA (9). An $(N - i + 2)$-dimensional point is chosen from the updated candidate set with the minimum distance $\mathrm{dist}_i$ to obtain the corresponding ZF-SIC solution. The Euclidean distance in (4) associated with the ZF-SIC solution is then calculated as $C_{\mathrm{new}}$.

Step 2: Adjust the search radius using the internal variable state and the $C_{\mathrm{new}}$ obtained from Step 1 according to the following two rules: (i) if state = 'RADIUS_DEC', then $C_{\mathrm{new}}$ replaces $C$ if $C_{\mathrm{new}} < C$; otherwise $C$ remains unchanged. (ii) if state = 'RADIUS_INC', then $C_{\mathrm{new}}$ replaces $C$ and the state variable is changed to 'RADIUS_DEC' if $C_{\mathrm{new}} \leq C$; otherwise the radius $C$ is increased according to the rule in [4].

Step 3: When the point in the intersection with the smallest Euclidean distance $\mathrm{dist}_1$ is obtained, or the candidate set becomes empty, terminate the procedure; otherwise set $i = i - 1$ and go to Step 1.

Note: If the candidate set at a certain stage becomes empty, then the ZF-SIC solution is regarded as the solution.
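The radius adjustment of Step 2 is a small two-state rule and can be sketched as below. The function name `update_radius` is an assumption, and the `grow` callback stands in for the radius-increase rule of [4], which is not reproduced in this paper; the factor 1.5 used in the example is purely a placeholder and not the rule from [4].

```python
def update_radius(C, C_new, state, grow):
    """One application of Step 2; returns (radius, state).

    grow is a caller-supplied function implementing the radius-increase
    rule (the rule of [4] in the paper; a placeholder in this sketch).
    """
    if state == "RADIUS_DEC":
        # Rule (i): only ever shrink the radius.
        return (C_new if C_new < C else C), state
    if state == "RADIUS_INC":
        # Rule (ii): accept the first radius that is small enough and
        # switch to shrinking; otherwise enlarge the radius.
        if C_new <= C:
            return C_new, "RADIUS_DEC"
        return grow(C), state
    raise ValueError("unknown state: " + repr(state))


def placeholder_grow(c):
    """Hypothetical stand-in for the radius-increase rule of [4]."""
    return 1.5 * c


r1 = update_radius(2.0, 1.5, "RADIUS_INC", placeholder_grow)
r2 = update_radius(2.0, 3.0, "RADIUS_INC", placeholder_grow)
r3 = update_radius(2.0, 2.5, "RADIUS_DEC", placeholder_grow)
r4 = update_radius(2.0, 1.0, "RADIUS_DEC", placeholder_grow)
```

Keeping the rule as a pure function of `(C, C_new, state)` makes the state transition from 'RADIUS_INC' to 'RADIUS_DEC' easy to check in isolation from the tree search itself.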

4. Proposed preprocessing technique for complexity reduction

From Section 3, the decoding complexity of the two-stage decoding algorithm depends on the number of candidate points $N_p$ of the initial candidate set. In order to reduce the decoding complexity, $N_p$ needs to be as small as possible. To this end, we propose an efficient preprocessing scheme to further reduce the decoding complexity and give a lower bound analysis of the proposed scheme.

Fig. 1. Proposed efficient slab search (ESS) algorithm.

4.1. A preprocessing with column permutation

After performing the QR decomposition, the rotated received signal vector $\mathbf{y}'$ can be represented as

$$\mathbf{y}' = [y'_1, y'_2, \cdots, y'_M]^T = [\mathbf{q}_1^T \mathbf{y}, \mathbf{q}_2^T \mathbf{y}, \cdots, \mathbf{q}_M^T \mathbf{y}]^T \in \mathbb{R}^M, \tag{10}$$

where $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_M] \in \mathbb{R}^{M \times M}$. Generally, $|y'_M|$ represents the distance from the origin to the slab, as shown in Fig. 2. In Fig. 2, we assume that the selected slab equation is $h_1 x_1 + h_2 x_2 = y$, and we define $|y'_M|$ as the location index $\xi$ of the slab. It is obvious that $N_p$ is smaller when $\xi$ becomes larger. Therefore, we can choose the slab that maximizes $\xi$ to reduce the decoding complexity.

Different reordering rules of the channel columns before the QR decomposition result in different $\mathbf{q}_M$. The optimization problem can therefore be reformulated as finding the appropriate $\mathbf{q}_M$ such that the value of $|y'_M|$ is maximal. The channel column reordering is formulated as $\mathbf{H}\mathbf{P}$, where $\mathbf{P}$ is a permutation matrix. An exhaustive search over all permutations obtains the optimal ordering, but its computational complexity is large. In this subsection, two low-complexity ordering algorithms are proposed. The algorithms are developed according to the following two lemmas.

Lemma 1. Given the QR decomposition of the ordered channel matrix $\mathbf{H}_o = \mathbf{Q}_o \mathbf{R}_o$, the value of $|y'_M|$ only depends on the received signal vector $\mathbf{y}$ and the permutation of the first $(M - 1)$ column vectors of the ordered channel matrix $\mathbf{H}_o$.

Proof. The channel matrix can be represented as $\mathbf{H} = [\mathbf{H}_1, \mathbf{H}_2]$, where $\mathbf{H}_1 \in \mathbb{R}^{M \times M}$ and $\mathbf{H}_2 \in \mathbb{R}^{M \times (N-M)}$. Performing the QR decomposition of the matrix $\mathbf{H}_1$ yields $\mathbf{H}_1 = \mathbf{Q}_1 \mathbf{R}_1$. Furthermore, the channel matrix can be represented as $\mathbf{H} = \mathbf{Q}_1 [\mathbf{R}_1, \mathbf{Q}_1^T \mathbf{H}_2]$. Therefore, $y'_M$ can be obtained by $\mathbf{q}_M^T \mathbf{y}$. Since $\mathbf{Q}_1$ is a unitary matrix, $\mathbf{q}_M$ can be uniquely decided once all the other $\mathbf{q}_i$ are available and all diagonal elements of $\mathbf{R}_1$ are constrained to be positive. □

Lemma 2. The problem of maximizing $|y'_M|$ is equivalent to finding a unitary matrix $\mathbf{Q}_1$ such that the summation of the correlations of the first $(M - 1)$ bases of $\mathbf{Q}_1$ with respect to the received vector $\mathbf{y}$ is minimized.

Proof. Because $\|\mathbf{y}'\|^2 = \|\mathbf{Q}_1^T \mathbf{y}\|^2 = \|\mathbf{y}\|^2$, the squared norm of $\mathbf{y}'$ can be expressed as $\|\mathbf{y}'\|^2 = \sum_{i=1}^{M} (y'_i)^2 = \sum_{i=1}^{M-1} (y'_i)^2 + |y'_M|^2$; therefore, the problem of maximizing $|y'_M|$ is equivalent to minimizing $\sum_{i=1}^{M-1} (y'_i)^2$, and the original problem can be reformulated as minimizing $\sum_{i=1}^{M-1} (\mathbf{y}^T \mathbf{q}_i)^2$. In other words, the optimization problem is equivalent to finding a unitary matrix $\mathbf{Q}_1$ such that the summation of the correlations of the first $(M - 1)$ bases of $\mathbf{Q}_1$ with respect to the received vector $\mathbf{y}$ is minimized. □

Fig. 2. Geometrical diagram of slabs with different y.

According to the two lemmas derived above, we propose two special ordering rules, namely, the projection ordering rule and the greedy ordering rule. The QR decompositions with the two rules are referred to as the Projection QR and Greedy QR decompositions, respectively.

Projection QR decomposition

Step 1: Calculate the projected norms $\{\|\langle \mathbf{y}, \mathbf{h}_i \rangle\|\}_{i=1}^{N}$, where $\mathbf{h}_i$ is the $i$th column of $\mathbf{H}$.

Step 2: Find the permutation matrix $\mathbf{P}$ such that the projected norms of the columns of the permuted matrix $\mathbf{H}\mathbf{P}$, from left to right, are in ascending order.

Step 3: Apply the standard QR decomposition to the permuted channel matrix $\mathbf{H}\mathbf{P}$, which yields $\mathbf{H}_o = \mathbf{Q}_o \mathbf{R}_o$.
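Steps 1 and 2 of the projection ordering amount to a sort of the columns by $|\mathbf{y}^T \mathbf{h}_i|$. The sketch below shows only this permutation step; the subsequent standard QR decomposition of Step 3 is left to any numerical library. The function name `projection_order` and the numerical values are assumptions made for the example.

```python
def projection_order(H, y):
    """Return (perm, H_perm): columns of H sorted by ascending |y^T h_i|."""
    n = len(H[0])
    # Step 1: projected norm of every column onto the received vector.
    proj = [abs(sum(y[k] * H[k][i] for k in range(len(y)))) for i in range(n)]
    # Step 2: permutation that puts the projections in ascending order.
    perm = sorted(range(n), key=lambda i: proj[i])
    H_perm = [[row[i] for i in perm] for row in H]
    return perm, H_perm


# Hypothetical real channel (M = 2, N = 3) and received vector.
H = [[1.0, 0.1, -2.0],
     [0.5, 0.2, 1.0]]
y = [1.0, 2.0]
perm, H_perm = projection_order(H, y)
```

Here the third column is orthogonal to `y`, so it is moved to the front, and the column with the largest projection ends up last.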

Greedy QR decomposition

Step 1: Initialize the vector set $V = \{\mathbf{h}_i, i = 1, 2, \ldots, N\}$ and the loop index $k = 1$.

Step 2: If $k = M$, go to Step 4; otherwise, choose the vector in $V$ with the lowest correlation with $\mathbf{y}$ according to the criterion $j = \arg\min_{1 \leq i \leq (N-k+1)} \{\langle \mathbf{y}, \mathbf{h}_i / \|\mathbf{h}_i\| \rangle\}$. Afterwards, normalize the chosen vector to obtain $\mathbf{q}_k$ and discard the selected $\mathbf{h}_j$ from $V$.

Step 3: Update $V$ by $\mathbf{h}_i = \mathbf{h}_i - \langle \mathbf{h}_i \cdot \mathbf{q}_k \rangle \mathbf{q}_k, \ \forall i$; then set $k$ to $k + 1$ and go to Step 2.

Step 4: Choose a column vector at random from $V$ and normalize it to obtain $\mathbf{q}_M$. Finally, compute $r_{jk} = \mathbf{q}_j \cdot \mathbf{h}_k$ to obtain the QR decomposition of the reordered $\mathbf{H}$.
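The four steps of the greedy rule can be sketched as a Gram-Schmidt loop with a data-dependent pivot. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the correlation is taken in absolute value (in line with Lemma 2), the "random" choice in Step 4 is replaced by the first remaining vector for reproducibility, and the channel is assumed generic so that no deflated vector has zero norm.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def greedy_qr_basis(H_cols, y, M):
    """Greedy ordering; H_cols is a list of the N column vectors of H.

    Returns (Q, order): the M orthonormal basis vectors and the column
    permutation (selected columns first, unselected columns last).
    """
    V = [list(h) for h in H_cols]      # deflated working copies
    idx = list(range(len(H_cols)))     # original indices of vectors in V
    Q, order = [], []
    for k in range(1, M + 1):
        if k < M:
            # Step 2: lowest normalized correlation with y.
            j = min(range(len(V)),
                    key=lambda i: abs(dot(y, V[i])) / math.sqrt(dot(V[i], V[i])))
        else:
            j = 0                      # Step 4: any remaining vector
        v = V.pop(j)
        order.append(idx.pop(j))
        norm = math.sqrt(dot(v, v))
        q = [c / norm for c in v]
        Q.append(q)
        # Step 3: remove the component along q from the remaining vectors.
        V = [[c - dot(h, q) * qc for c, qc in zip(h, q)] for h in V]
    return Q, order + idx


# Hypothetical channel columns (M = 2, N = 3) and received vector.
H_cols = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
y = [1.0, 0.0]
Q, order = greedy_qr_basis(H_cols, y, M=2)
# Step 4 (second part): r_jk = q_j . h_k for the reordered columns.
R = [[dot(q, H_cols[c]) for c in order] for q in Q]
```

In this toy case the first basis vector is orthogonal to `y`, so the last basis vector captures all of its energy and $|y'_M| = \|\mathbf{y}\|$, which is the situation the ordering tries to approach.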

In summary, the projection norm ordering rule permutes the column vectors of the channel matrix in ascending order of the absolute value of the projection between the received signal vector $\mathbf{y}$ and each column vector $\mathbf{h}_j$ of the channel matrix. The result suffers from the interference caused by other non-orthogonal column vectors. In contrast, the greedy ordering rule always attempts to eliminate this interference at each search loop, so that the correlation between $\mathbf{y}$ and the $k$th column vector of the $\mathbf{Q}$ matrix can be as small as possible.

4.2. Complexity analysis

To analyze the computational complexity of our proposed preprocessing techniques, the statistical property of the number of candidate points obtained in each channel and noise realization is needed. Because the number of candidate points obtained is determined by the slab equation, we need to analyze the statistical property of the slab equation, i.e., we need to obtain the analytical form of the distribution of $y'_M$. To the best of our knowledge, the analytical form of the distribution of $y'_M$ with the greedy ordering rule is difficult to obtain, due to the dynamic column selection during iterative processing. Instead, a lower bound is analyzed here. Recalling the projection ordering rule, the channel matrix is reordered according to the absolute value of each projection; therefore, we first define $N$ random variables $z_i$, $1 \leq i \leq N$:

$$z_i = \left| \mathbf{y}^T \mathbf{h}_i \right| = \left| \sum_{k=1}^{M} y_k h_{ki} \right|, \quad 1 \leq i \leq N, \tag{11}$$

and then derive the corresponding pdf as

$$f_{z_i}(z) = \begin{cases} \dfrac{2}{\sqrt{2\pi\sigma_z^2}}\, e^{-\frac{z^2}{2\sigma_z^2}}, & z \geq 0 \\ 0, & z < 0 \end{cases}, \tag{12}$$

where $\sigma_z^2 = \frac{M}{2} \left( \frac{N E_s}{4} + \sigma_n^2 \right)$.

The distribution of $y'_M$ depends on the distribution of the diagonal elements of the matrix $\mathbf{R}$, and also on the ordering of the channel matrix. Therefore, we need to characterize the distribution of the ordered channel vectors. For analyzing the distribution, the two random variables $z_i$ defined in (11) and $h_{1i}$ are paired and denoted as $(z_i, h_{1i})$, where $h_{1i}$ is the first entry of the column vector $\mathbf{h}_i$. Based on remarks (ii) and (iii) in Section 2, it can be easily shown that $\{(z_i, h_{1i})\}_{i=1}^{N}$ are i.i.d. random variable pairs. Hence, the joint density function of these paired random variables can be expressed as

$$f_{z_i, h_{1i}}(z, h) = \begin{cases} \dfrac{2}{\sqrt{2\pi\sigma_k^2}}\, e^{-\frac{z^2}{2\sigma_k^2}}\, \dfrac{2}{\sqrt{2\pi}}\, e^{-h^2}, & z \geq 0 \\ 0, & z < 0 \end{cases}, \tag{13}$$

where $\sigma_k^2 = \frac{\sigma_z^2}{M} \left( h^2\, \frac{2\sigma_z^2}{M} + (M - 1) \right)$. Furthermore, we arrange these $z_i$ $(1 \leq i \leq N)$ in ascending order as $z_{1:N} \leq z_{2:N} \leq \cdots \leq z_{N:N}$; then the $h_{1i}$ $(1 \leq i \leq N)$ paired with these order statistics are denoted by $h_{[1:N]}, h_{[2:N]}, \cdots, h_{[N:N]}$. Since $\{(z_i, h_{1i})\}_{i=1}^{N}$ are i.i.d. random variable pairs, the conditional pdf of $h_{[i:N]}$ given $z_{i:N} = z$ can be obtained by $f_{h_{[i:N]}}(h \mid z_{i:N} = z) = f(h \mid z)$. Hence, the statistical property of each ordered column vector $h_{[i:N]}$ can be obtained by $f_{z_{i:N}, h_{[i:N]}}(z, h) = f(h \mid z)\, f_{z_{i:N}}(z)$ and expressed as

$$f_{h_{[i:N]}}(h) = \int_{-\infty}^{\infty} f(h \mid z)\, f_{z_{i:N}}(z)\, dz = \int_{-\infty}^{\infty} \frac{f_{z_i h_i}(z, h)}{f_{z_i}(z)}\, f_{z_{i:N}}(z)\, dz = \frac{2\, I_{N,i}(h)\, N!}{\pi \sigma_k^2\, (i - 1)!\, (N - i)!}\, e^{-\frac{h^2}{2}}. \tag{14}$$

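Furthermore, by performing the QR decomposition of the ordered channel matrix $\mathbf{H}_o = \mathbf{Q}_o \mathbf{R}_o$ according to the proposed projection ordering rule, we can characterize the cdf of the square of the $M$th diagonal entry of $\mathbf{R}_o$, denoted by $r_{o,M,M}^2$, as

$$\begin{aligned} F_{r_{o,M,M}^2}(r) &= \int_0^1 \left\{ \int_0^{r/s} f_{w_{o,M}, s_M}(w, s)\, dw \right\} ds \\ &= \int_0^1 \int_0^{r/s} f_{w_{o,M}}(w)\, f_{S_s}(s)\, dw\, ds \\ &= \int_0^1 \int_0^{r/s} \left\{ \int_{-\infty}^{\infty} \frac{1}{|w|}\, f_w(w)\, f_g\!\left(\frac{z}{w}\right) [F_z(z)]^{M-1}\, [1 - F_z(z)]^{N-M}\, dz \right\} \\ &\quad \times C \left( s^{-\frac{1}{2}} (1 - s)^{\frac{M-3}{2}} \left( \frac{M - 2}{2} \right)! \right) dw\, ds, \end{aligned} \tag{15}$$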

where $C = \dfrac{1}{\beta(M, N - M + 1) \cdot \beta\!\left(\frac{M-1}{2}, \frac{1}{2}\right)}$. The detailed derivation of (15) is shown in Appendix A. Hence, according to (7), $y'_M$ can be expressed as

$$y'_M = r_{o,MM}\, x_{o,M} + \sum_{i=M+1}^{N} r_{o,M,i}\, x_{o,i} + n_M, \tag{16}$$

where $x_{o,i}$ is the transmitted symbol corresponding to the ordered channel vector $h_{[i:N]}$. In fact, $x_{o,i}$ is a random variable with a discrete uniform distribution and is independent of the channel ordering.

Lemma 3. The random variables $r_{M,i}$ and $h_{[i:N]}$ have the same pdfs for $i = M + 1, \ldots, N$.

Proof. The ordered channel matrix can be represented as $\mathbf{H}_o = [\mathbf{H}_{o,1}\ \mathbf{H}_{o,2}]$. Performing the QR decomposition of the matrix $\mathbf{H}_{o,1}$ yields $\mathbf{H}_{o,1} = \mathbf{Q}_{o,1} \mathbf{R}_{o,1}$. Furthermore, the channel matrix can be represented as $\mathbf{H}_o = \mathbf{Q}_{o,1} [\mathbf{R}_{o,1}\ \mathbf{Q}_{o,1}^T \mathbf{H}_{o,2}] = \mathbf{Q}_o \mathbf{R}_o$, where $\mathbf{H}_{o,2} = [\mathbf{h}_{[M+1:N]}, \cdots, \mathbf{h}_{[N:N]}]$. The elements $r_{i,j}$ of the matrix $\mathbf{R}_o$ can be obtained by $r_{i,j} = \mathbf{q}_i^T \mathbf{h}_{[j:N]}$ for $i = 1, \ldots, M$ and $j = M + 1, \ldots, N$. It is well known that the pdf of a Gaussian random vector is invariant under the orthogonal transformation by $\mathbf{Q}_o$. Therefore, the random variables $r_{M,i}$ and $h_{[i:N]}$ have the same pdfs for $i = M + 1, \ldots, N$. □

According to Lemma 3 and substituting (14) and (15) into (16), the pdf of $y'_M$ can be obtained numerically. Recalling the motivation of the proposed ordering scheme, we appropriately choose a specific slab for reducing the decoding complexity because the average decoding complexity can be expressed in terms of $N_p$, which is proportional to the intersectional volume of the constellation space and the specific ($M$th) slab for a given radius $C$. $N_p$ can be obtained by the approximated formulation proposed in [13] as

$$N_p = \begin{cases} \left( 2|Q|^{N-1} - 1 \right) \left[ 1 - \left( \dfrac{y}{R} \right)^2 \right]^{(N-1)/2} + 1, & \text{if } |y| \leq C; \\ 1, & \text{otherwise.} \end{cases} \tag{17}$$

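Hence, the expected value of $N_p$ can be expressed as

$$\begin{aligned} E[N_p(y)] &= \int_{C}^{\infty} f_y(y)\, dy + \int_{-\infty}^{-C} f_y(y)\, dy \\ &\quad + \int_{-C}^{C} \left\{ \left( 2|Q|^{K-1} - 1 \right) \left[ 1 - \left( \frac{y}{R} \right)^2 \right]^{(K-1)/2} + 1 \right\} f_y(y)\, dy, \end{aligned} \tag{18}$$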

where $f_y(y)$ denotes the pdf of $y$ according to the slab equation. Therefore, we can evaluate the expected value of $N_p$ obtained by the ESS algorithm with and without the projection ordering rule by substituting the pdfs of $y'_M$ and $y_M$ into (18), respectively. Furthermore, the complexity reduction ratio $\gamma$ of the projection ordering rule is defined as

$$\gamma = \frac{N_{p,\text{without\_ordering}} - N_{p,\text{with\_ordering}}}{N_{p,\text{without\_ordering}}} = 1 - \frac{E[N_{o,p}(y'_M)]}{E[N_p(y_M)]}. \tag{19}$$

By the above procedure and the analytical results of (16), the average complexity reduction ratio can also be obtained numerically instead of by time-consuming Monte-Carlo trials.

In this section, based on the search philosophy of the efficient decoder proposed in Section 3, we propose a preprocessing scheme to reduce the decoding complexity from the geometrical perspective. The sub-optimal ordering rule is also analyzed; furthermore, the complexity reduction ratio can be obtained in an analytical form. The statistical properties of the diagonal elements of the $\mathbf{R}$ matrix are crucial to the performance of the QR-based detector. An appropriately ordered channel matrix can effectively generate the desired statistical properties of the diagonal elements of the $\mathbf{R}$ matrix. An analytical framework to characterize the statistical properties of the diagonal elements of the $\mathbf{R}$ matrix under general ordering rules is presented. The mathematical procedure can be applied to any static ordering rule, even a non-linear ordering rule, for QR-based MIMO decoders.

5. Numerical results

In this section, numerical results are provided to verify the effectiveness of the proposed decoding algorithms and preprocessing techniques. In addition to these decoding algorithms, the MMSE-based robust fixed-complexity sphere decoding algorithm (FSDA-MMSE) proposed in [15] is also evaluated for performance comparison. For FSDA-MMSE, $Q^{N_t - N_r}$ candidate points are searched. Therefore, as the difference between the numbers of transmit and receive antennas increases, the number of candidate points to be searched also increases. On the other hand, searching more candidate points contributes to better performance in terms of decoding error rate. As a result, the error rate performance of "FSDA-MMSE" is almost the same as that of the ML decoder when the difference between the numbers of transmit and receive antennas is large. Nonetheless, the computational complexity of FSDA-MMSE in this case is also large. This property can be observed in the following simulations. The decoding performance is measured in terms of bit error rate (BER). The computational complexity is measured in terms of the average number of floating point operations (flops) and includes all involved operations (e.g., QR decomposition). All real additions, multiplications, and comparisons are treated equally as flops. The parameters of FSDA-MMSE follow [15]. The radius $C$ of "ESS + SIC-RSSDA" is chosen according to [4] with Φ = 0.99, = 0.01 [13]. Note that the preprocessing with the greedy ordering rule is adopted in "ESS + SDA" and "ESS + SIC-RSSDA". In addition, the performance of "SSD + SDA" and the proposed "ESS + SDA" is the same as that of the ML detector.

Fig. 3. Probability density function of various ordering rules.

Fig. 4. The comparison of the averaged complexity reduction ratio for various ordering rules.

Fig. 5. BER performance for a (5,4) MIMO system with 64-QAM.

First, we investigate the probability density function of $|y'_M|$ under the aforementioned ordering rules; Fig. 3 shows the result for the (4, 2) 16-QAM underdetermined MIMO system at SNR = 15 dB. From Fig. 3, under the proposed greedy ordering rule, which has the advantage of lower complexity, the probability density of $|y'_M|$ tends to be distributed away from the origin and is almost identical to that of the exhaustive search approach. On the other hand, the results also show that the projection ordering rule is still an effective approach.

Furthermore, we choose 16-QAM underdetermined MIMO systems with $N_r = 2$ and $N_t = 3$, 4, and 5, respectively, and set SNR = 15 dB. The complexity reduction ratio of $N_p$ is shown in Fig. 4. The reduction ratio saturates as the antenna number difference increases. This is because $N_p$ increases faster than the value of $|y'_M|$ when the antenna number difference increases. However, the greedy ordering rule can still provide significant reductions even when $N_p$ increases exponentially. In Fig. 4, it is observed that the numerical result of the developed framework closely follows the Monte-Carlo trials, which verifies the accuracy of the mathematical analysis.

Fig. 6. Computational complexity for a (5,4) MIMO system with 64-QAM.

We next investigate the performance of (5, 4) MIMO systems with 64-QAM, which represents MIMO systems with the large constellation size. Fig. 5 shows the BER performance of the de-coding algorithms. Because both “ESS + SDA” and “SSD + SDA” can achieve ML performance, the performance of “SSD + SDA” does not shown in this figure. In this figure, the proposed “ESS + SDA” and “ESS + SIC-RSSDA” have similar performance, and both of them outperform “FSDA-MMSE”. Fig. 6 shows the computational com-plexity of the above decoding algorithms. It is easily seen that the proposed ESS algorithm can effectively achieve lower the computa-tional complexity than SSD algorithm. Moreover, the proposed SIC-RSSDA can reduce computational complexity of traditional SDA, especially in low SNR regime.

Finally, we choose the (5, 2) MIMO systems with 4-QAM to represent MIMO systems with a large difference between the numbers of transmit and receive antennas. Fig. 7 shows the BER performance of the decoding algorithms. In this figure, “FSD-MMSE” achieves almost the same BER as “ESS + SDA”, which is better than that of “ESS + SIC-RSSDA”. The reason is that SIC-RSSDA often discards the ML candidate during the decoding process, which causes the performance degradation. “FSD-MMSE” searches $Q^{N_t-N_r}$ candidate points, which is 64 in this case and is far larger than the number of candidate points in “ESS + SDA” and “ESS + SIC-RSSDA”. Therefore, “FSD-MMSE” can achieve near-ML performance at the expense of greater computational complexity. Fig. 8 shows the computational complexity of the decoding algorithms. It can be seen that the proposed “ESS + SDA” and “ESS + SIC-RSSDA” achieve much lower computational complexity than “SSD + SDA” and “FSDA-MMSE” in all SNR regimes.
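As a quick check of the candidate count quoted above, the following minimal Python sketch evaluates the number of points obtained by enumerating a constellation of size Q over the excess transmit layers. The function name `fsd_candidate_count` and its argument names are hypothetical and are introduced only for this illustration.

```python
def fsd_candidate_count(constellation_size: int, n_tx: int, n_rx: int) -> int:
    """Return Q^(Nt - Nr), the number of points enumerated over the excess transmit layers."""
    if n_tx < n_rx:
        raise ValueError("expected an underdetermined system with n_tx >= n_rx")
    return constellation_size ** (n_tx - n_rx)


# (5, 2) system with 4-QAM, the configuration discussed for Figs. 7 and 8
count_5x2_4qam = fsd_candidate_count(4, 5, 2)
print(count_5x2_4qam)
```

For the (5, 2) system with 4-QAM this evaluates to 4^3 = 64, in agreement with the figure stated in the text.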

In this section, simulation results show that the proposed “ESS + SDA” can achieve better performance than benchmark decoding algorithms when the constellation size or the difference between the numbers of transmit and receive antennas is large. Besides, the proposed “ESS + SIC-RSSDA” can achieve even lower computational complexity than “ESS + SDA” at the cost of some BER performance degradation. The simulations also confirm the theoretical analysis and mathematical derivation of the complexity, so the analysis can be used for predicting the complexity reduction.

Fig. 7. BER performance for a (5,2) MIMO system with 4-QAM.

Fig. 8. Computational complexity for a (5,2) MIMO system with 4-QAM.

6. Conclusions

In this paper, we propose a low-complexity ML decoding algorithm for underdetermined MIMO systems. In addition, we propose two preprocessing schemes and prove that the search strategy can significantly reduce the computational complexity in the lower SNR regime without any performance degradation. A general analytical framework of the static ordering rule for QR-based MIMO decoders is also presented. To further reduce the computational complexity, a sub-optimal decoding algorithm is also developed. Simulation results show that the proposed ML decoder achieves lower complexity than existing low-complexity ML and fixed-complexity decoders. In addition, simulations confirm the mathematical analysis of the proposed preprocessing scheme. The results indicate that the optimal ordering rule depends on the column vectors of the channel matrix and also on the received signal vector. The proposed low-complexity decoding algorithms are suitable for real-time applications, e.g., downlink MIMO transmission with more transmit antennas than receive antennas, and provide a promising solution for MIMO wireless communication systems such as LTE-Advanced. For MIMO systems with larger dimensions, such as Massive MIMO systems, the ML decoding algorithm has prohibitively high complexity. The techniques proposed in this paper may be properly modified to be applicable to Massive MIMO systems. A low-complexity decoding algorithm for Massive MIMO systems is left as future work.

Appendix A

We reorder the columns of $H$ to be $H_o = [h_{o,1}, h_{o,2}, \ldots, h_{o,N}]$ according to the projection norm of the received signal vector $y$ in ascending order, $\|y^T h_{o,1}\| \le \|y^T h_{o,2}\| \le \cdots \le \|y^T h_{o,N}\|$. Here $h_{o,i}$ can be expressed as $h_{o,i} = \sqrt{w_{o,i}}\,\theta_i$, where $w_{o,i}$ is the $i$th order statistic of $N$ independent Gamma$(M, 1)$ distributed random variables; the $\{\theta_i\}$ are i.i.d. uniformly distributed on the unit sphere in $\mathbb{R}^M$ [23,24]; $w_{o,i}$ and $\theta_i$ are mutually independent. With the QR decomposition $H_o = Q_o R_o$, we can characterize the distribution of the square of the $M$th diagonal entry of $R_o$, denoted by $r^2_{o,M,M}$.

First, letting $Q_o = [q_{o,1}, q_{o,2}, \ldots, q_{o,M}]$ and performing the QR decomposition of $H_o$ ($H_o = Q_o R_o$), we obtain

$$
r^2_{o,i,i} = w_{o,i}\left[1 - \sum_{k=1}^{i-1}\left(q_{o,k}^T\theta_i\right)^2\right] = w_{o,i}\left[1 - \sum_{k=1}^{i-1}\theta_i^2(k)\right] = w_{o,i}\,s_i, \tag{20}
$$

where

$$
s_i = \left[1 - \sum_{k=1}^{i-1}\theta_i^2(k)\right] \tag{21}
$$

and $\theta_i(k)$ denotes the $k$th element of $\theta_i$. Note that the second equality holds because the distribution of $\theta_i$ is invariant under the orthogonal transformation by $Q_o$. To derive the cdf of $r^2_{o,M,M}$, we should obtain the pdfs of $w_{o,M}$ and $s_M$. In order to obtain the characteristic of $w_{o,M}$, we first define a new random variable $z_i = \|y^T(\sqrt{2}h_i)\|^4$, and the projection ordering rule can be formulated as a sorting process:

$$
\begin{aligned}
\left\{\|y^T h_{o(1)}\|, \|y^T h_{o(2)}\|, \cdots, \|y^T h_{o(N)}\|\right\}
&= \mathrm{sort}\left\{\|y^T h_1\|, \|y^T h_2\|, \cdots, \|y^T h_N\|\right\} \\
&= \mathrm{sort}\left\{\|y^T(\sqrt{2}h_1)\|^4, \|y^T(\sqrt{2}h_2)\|^4, \cdots, \|y^T(\sqrt{2}h_N)\|^4\right\} \\
&= \mathrm{sort}\{z_1, z_2, \cdots, z_N\}.
\end{aligned} \tag{22}
$$
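The projection ordering in (22) and the factorization in (20) can be checked numerically. The sketch below is a minimal illustration rather than the authors' implementation: it assumes real-valued channel columns and a real received vector with i.i.d. N(0, 1) entries, and the helper names `dot`, `projection_order`, and `diag_sq_two_ways` are hypothetical. It sorts the columns by $|y^T h_i|$, runs Gram-Schmidt on the first $M$ ordered columns, and compares each squared diagonal entry of $R_o$ with $w_{o,i}[1 - \sum_{k<i}(q_{o,k}^T\theta_i)^2]$.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length real vectors."""
    return sum(x * y for x, y in zip(a, b))


def projection_order(columns, y):
    """Sort channel columns by |y^T h_i| in ascending order, as in (22)."""
    return sorted(columns, key=lambda h: abs(dot(y, h)))


def diag_sq_two_ways(columns):
    """Compare r_ii^2 from Gram-Schmidt with w_i * s_i from (20), first M columns."""
    m = len(columns[0])
    basis, measured, predicted = [], [], []
    for h in columns[:m]:
        w = dot(h, h)                              # w_i = ||h_i||^2
        theta = [x / math.sqrt(w) for x in h]      # unit direction theta_i
        resid = list(h)
        for q in basis:                            # remove components along q_1..q_{i-1}
            c = dot(q, h)
            resid = [r - c * qj for r, qj in zip(resid, q)]
        r_sq = dot(resid, resid)                   # squared i-th diagonal entry of R
        measured.append(r_sq)
        predicted.append(w * (1.0 - sum(dot(q, theta) ** 2 for q in basis)))
        basis.append([r / math.sqrt(r_sq) for r in resid])
    return measured, predicted


rng = random.Random(0)                             # fixed seed, illustration only
M, N = 3, 5
y = [rng.gauss(0.0, 1.0) for _ in range(M)]
H = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(N)]  # N columns of length M
H_o = projection_order(H, y)
measured, predicted = diag_sq_two_ways(H_o)
```

The two lists agree up to floating-point rounding, because $\|h - \sum_k (q_k^T h) q_k\|^2 = \|h\|^2 - \sum_k (q_k^T h)^2$ for orthonormal $q_k$.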

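From [25], $\theta_i$ can be modeled from an $M$-dimensional random vector $V = [v_1, v_2, \ldots, v_M]^T$ with $v_i \sim$ i.i.d. $N(0, 1)$, where $\theta_i(k) = \frac{v_k}{\|V\|} = \frac{v_k}{\sqrt{v_1^2 + v_2^2 + \cdots + v_M^2}}$. Therefore, $z_i$ can be expressed as

$$
z_i = \left\|y^T(\sqrt{2}h_i)\right\|^4 = w_i\,\frac{\left(\sum_{k=1}^{M} y_k\theta_i(k)\right)^2}{v_1^2 + v_2^2 + \cdots + v_M^2} = w_i\,\frac{m_i}{t_i} = w_i\,g_i, \tag{23}
$$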

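where the random variables $w_i$ and $t_i$ have the Gamma$(M, 1)$ distribution and the chi-square distribution with $M$ degrees of freedom, respectively. The pdf of $m_i$ can be approximated as

$$
f_{m_i}(m) =
\begin{cases}
\dfrac{1}{\sqrt{2\pi\sigma_m^2}}\, m^{-\frac{1}{2}}\, e^{-\frac{m}{2\sigma_m^2}}, & m > 0 \\
0, & m \le 0
\end{cases}
\quad \text{where } \sigma_m^2 = M\sigma_y. \tag{24}
$$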
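The Gaussian construction of $\theta_i$ taken from [25] above, $\theta_i(k) = v_k/\|V\|$, can be sketched in a few lines of Python. The function name `uniform_on_sphere` is hypothetical, and the fixed seed and the dimension are chosen only for illustration.

```python
import math
import random


def uniform_on_sphere(dim, rng):
    """Draw one point uniformly on the unit sphere in R^dim by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]      # v_k ~ i.i.d. N(0, 1)
    norm = math.sqrt(sum(x * x for x in v))            # ||V||
    return [x / norm for x in v]                       # theta(k) = v_k / ||V||


rng = random.Random(1)
theta = uniform_on_sphere(4, rng)
squared_norm = sum(x * x for x in theta)               # equals 1 up to rounding
```

Normalizing an isotropic Gaussian vector works because its distribution is invariant under rotations, so the resulting direction has no preferred orientation.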

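According to (23), we can derive the pdf of $g_i$ as

$$
f_{g_i}(g) = \int_{-\infty}^{\infty} |t|\, f_{m_i}(tg)\, f_t(t)\, dt = \frac{\sigma_v\, g^{-\frac{1}{2}}\left(1 + \frac{\sigma_v^2}{\sigma_m^2}\, g\right)^{-\frac{M+1}{2}}}{\sigma_m\, \beta\!\left(\frac{M}{2}, \frac{1}{2}\right)}. \tag{25}
$$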

With the characteristic of $g_i$, the cdf of $z_i$ can be expressed as

$$
F_{z_i}(z) = \int_{0}^{\infty} f_{g_i}(g)\, F_w\!\left(\frac{z}{g}\right) dg. \tag{26}
$$

Since both $w_i$ and $g_i$ are non-negative random variables, according to order statistics [26], we can express $f_{o,z_i}(z)$ as

$$
f_{o,z_i}(z) = \frac{N!\,[F_z(z)]^{i-1}\,[1 - F_z(z)]^{N-i}\, f_z(z)}{(i-1)!\,(N-i)!}. \tag{27}
$$
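The order-statistic density in (27) is straightforward to evaluate once the parent cdf and pdf are available. The sketch below is a generic illustration with a hypothetical function name `order_statistic_pdf`; it uses Uniform(0, 1) parents purely as a stand-in, not the distribution of $z_i$ derived in this appendix.

```python
from math import factorial


def order_statistic_pdf(x, i, n, cdf, pdf):
    """pdf of the i-th smallest of n i.i.d. samples, following the form of (27)."""
    coeff = factorial(n) / (factorial(i - 1) * factorial(n - i))
    return coeff * cdf(x) ** (i - 1) * (1.0 - cdf(x)) ** (n - i) * pdf(x)


def uniform_cdf(x):
    """cdf of Uniform(0, 1); stand-in parent distribution for this example only."""
    return min(max(x, 0.0), 1.0)


def uniform_pdf(x):
    """pdf of Uniform(0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


# Median of three uniform samples: the density is 6 x (1 - x), so 1.5 at x = 0.5
value = order_statistic_pdf(0.5, 2, 3, uniform_cdf, uniform_pdf)

# Midpoint-rule check that the density integrates to approximately one
steps = 1000
total = sum(
    order_statistic_pdf((k + 0.5) / steps, 2, 3, uniform_cdf, uniform_pdf)
    for k in range(steps)
) / steps
```

The leading coefficient $N!/[(i-1)!(N-i)!]$ equals $1/\beta(i, N-i+1)$, which is the form in which it appears in (28).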

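From (25) and (27), the pdf of $w_{o,i}$ can be obtained as

$$
\begin{aligned}
f_{w_{o,i}}(w) &= \int_{-\infty}^{\infty} f(w|z)\, f_{o,z_i}(z)\, dz \\
&= \int_{-\infty}^{\infty} \frac{f_{z_i w_i}(z, w)}{f_{z_i}(z)}\, f_{o,z_i}(z)\, dz \\
&= \frac{1}{\beta(i, N-i+1)} \int_{-\infty}^{\infty} \frac{1}{|w|}\, f_w(w)\, f_{g_i}\!\left(\frac{z}{w}\right) [F_z(z)]^{i-1}\, [1 - F_z(z)]^{N-i}\, dz.
\end{aligned} \tag{28}
$$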

Next, because $\theta_i^H\theta_i = 1$, $s_i$ can be rewritten as

$$
s_i = \left[1 - \sum_{k=1}^{i-1}\theta_i^2(k)\right] = \sum_{k=i}^{M}\theta_i^2(k). \tag{29}
$$

Therefore, we have

$$
s_i = \sum_{k=i}^{M}\theta_i^2(k) = \frac{v_i^2 + v_{i+1}^2 + \cdots + v_M^2}{v_1^2 + v_2^2 + \cdots + v_M^2} = \frac{q_i}{p_i}, \tag{30}
$$

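where $q_i$ and $p_i$ are chi-square random variables with $(M - i + 1)$ and $M$ degrees of freedom, respectively. The joint pdf of $q_M$ and $p_M$ is

$$
\begin{aligned}
f_{q_M,p_M}(q, p) &= f_{q_M}(q) \cdot f_{p_M - q_M}(p - q) \\
&= f_{\chi(1)}(q) \cdot f_{\chi(M-1)}(p - q) \\
&= \frac{q^{-\frac{1}{2}} \cdot (p - q)^{\frac{M-3}{2}} \cdot e^{-\frac{p}{2}}}{2^{\frac{M}{2}} \cdot \Gamma\!\left(\frac{1}{2}\right) \cdot \Gamma\!\left(\frac{M-1}{2}\right)}, \quad \text{for } p > 0 \text{ and } q > 0,
\end{aligned} \tag{31}
$$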

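where $f_{\chi(k)}(x)$ denotes the pdf of the chi-square random variable with $k$ degrees of freedom. The pdf of $s_M$ can be expressed as

$$
\begin{aligned}
f_{s_M}(s) &= \int_{0}^{\infty} |p|\, f_{q_M,p_M}(ps, p)\, dp \\
&= \frac{s^{-\frac{1}{2}}\,(1 - s)^{\frac{M-3}{2}}\left(\frac{M-2}{2}\right)!}{\Gamma\!\left(\frac{M-1}{2}\right)\Gamma\!\left(\frac{1}{2}\right)} \qquad \left(\because \int_{0}^{\infty} x^n e^{-\mu x}\, dx = n!\,\mu^{-n-1}\right).
\end{aligned} \tag{32}
$$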
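A simple Monte Carlo experiment gives a sanity check on the ratio in (30) for $i = M$ and on the density in (32). Reading $(\frac{M-2}{2})!$ as $\Gamma(\frac{M}{2})$, the density in (32) is that of a Beta$(\frac{1}{2}, \frac{M-1}{2})$ random variable, whose mean is $1/M$. The sketch below, with the hypothetical helper `sample_s_last` and an arbitrarily chosen seed, dimension, and trial count, compares the sample mean of $s_M$ with $1/M$.

```python
import random


def sample_s_last(dim, rng):
    """One draw of s_M = v_M^2 / (v_1^2 + ... + v_M^2) with i.i.d. N(0, 1) entries, as in (30)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return v[-1] ** 2 / sum(x * x for x in v)


rng = random.Random(2015)                 # fixed seed, illustration only
M, trials = 4, 20000
draws = [sample_s_last(M, rng) for _ in range(trials)]
sample_mean = sum(draws) / trials         # should be close to 1 / M
```

The same value $1/M$ also follows from symmetry, since the $M$ ratios $v_k^2/\sum_j v_j^2$ are identically distributed and sum to one.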

Since $w_{o,M}$ and $s_M$ are independent, the joint pdf of $w_{o,M}$ and $s_M$ is

$$
f_{w_{o,M},s_M}(w, s) = f_{w_{o,M}}(w) \cdot f_{s_M}(s). \tag{33}
$$

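As a result, the cdf of $r^2_{o,M,M}$ can be expressed as

$$
\begin{aligned}
F_{r^2_{o,M,M}}(r) &= \int_{0}^{1}\left\{\int_{0}^{r/s} f_{w_{o,M},s_M}(w, s)\, dw\right\} ds \\
&= \int_{0}^{1}\int_{0}^{r/s} f_{w_{o,M}}(w)\, f_{s_M}(s)\, dw\, ds \\
&= \int_{0}^{1}\int_{0}^{r/s}\left\{\int_{-\infty}^{\infty} \frac{1}{|w|}\, f_w(w)\, f_g\!\left(\frac{z}{w}\right) [F_z(z)]^{M-1}\, [1 - F_z(z)]^{N-M}\, dz \;\, C\left(s^{-\frac{1}{2}}\,(1 - s)^{\frac{M-3}{2}}\left(\frac{M-2}{2}\right)!\right)\right\} dw\, ds.
\end{aligned} \tag{34}
$$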
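The cdf in (34) can be compared against an empirical cdf obtained from Monte Carlo trials. The following sketch shows only how such trials might be generated; it is not the simulation setup of this paper. It assumes real-valued $H$ and $y$ with i.i.d. N(0, 1) entries, and the names `last_diag_sq`, `sample_r_sq`, and `empirical_cdf` are hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length real vectors."""
    return sum(x * y for x, y in zip(a, b))


def last_diag_sq(columns):
    """Squared M-th diagonal entry of R from Gram-Schmidt on the first M columns."""
    m = len(columns[0])
    basis, r_sq = [], 0.0
    for h in columns[:m]:
        resid = list(h)
        for q in basis:                                # project out earlier directions
            c = dot(q, h)
            resid = [r - c * qj for r, qj in zip(resid, q)]
        r_sq = dot(resid, resid)
        basis.append([r / math.sqrt(r_sq) for r in resid])
    return r_sq


def sample_r_sq(m, n, rng):
    """One Monte Carlo draw of r^2_{o,M,M} under the projection ordering rule."""
    y = [rng.gauss(0.0, 1.0) for _ in range(m)]
    cols = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    cols.sort(key=lambda h: abs(dot(y, h)))            # ascending |y^T h_i|
    return last_diag_sq(cols)


def empirical_cdf(samples, r):
    """Fraction of samples that do not exceed r."""
    return sum(1 for s in samples if s <= r) / len(samples)


rng = random.Random(7)                                 # fixed seed, illustration only
samples = [sample_r_sq(2, 4, rng) for _ in range(2000)]
cdf_points = [empirical_cdf(samples, r) for r in (0.5, 1.0, 2.0, 4.0)]
```

Evaluating `empirical_cdf` on a grid of $r$ values gives a curve that could be plotted against a numerical evaluation of (34) under matching distributional assumptions.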

References

[1] S.J. Lu, R.Y. Chang, W.H. Chung, C.E. Chen, Realizing high accuracy transmission in high-rate data broadcasting networks with heterogeneous users via cooperative communication, Digit. Signal Process. 25 (Feb. 2014) 93–103.

[2] G.J. Foschini, M.J. Gans, On limits of wireless communications in a fading environment when using multiple antennas, Wirel. Pers. Commun. 6 (3) (Mar. 1998) 311–355.

[3] C.P. Schnorr, M. Euchner, Lattice basis reduction: improved practical algorithms and solving subset sum problems, Math. Program. 66 (Aug. 1994) 181–191.

[4] B. Hassibi, H. Vikalo, On the sphere-decoding algorithm I. Expected complexity, IEEE Trans. Signal Process. 53 (8) (Aug. 2005) 2806–2818.

[5] M. El-Khamy, M. Medra, H.M. ElKamchouchi, Reduced complexity list sphere decoding for MIMO systems, Digit. Signal Process. 25 (Feb. 2014) 84–92.

[6] M.O. Damen, K. Abed-Meraim, J.-C. Belfiore, A generalised sphere decoder for asymmetrical space–time communication architecture, Electron. Lett. 36 (2) (Jan. 2000) 166–167.

[7] T. Cui, C. Tellambura, An efficient generalized sphere decoder for rank-deficient MIMO systems, IEEE Commun. Lett. 9 (5) (May 2005) 423–425.

[8] P. Wang, T.L. Ngoc, A low-complexity generalized sphere decoding approach for underdetermined linear communication systems: performance and complexity evaluation, IEEE Trans. Commun. 57 (11) (Nov. 2009) 3376–3388.

[9] G. Romano, F. Palmieri, P.S. Rossi, D. Mattera, A tree-search algorithm for ML decoding in underdetermined MIMO systems, in: Proc. ISWCS 2009, 2009, pp. 662–665.

[10] G. Romano, F. Palmieri, P.S. Rossi, F. Palmieri, Tree-search ML detection for underdetermined MIMO systems with M-PSK constellations, in: Proc. ISWCS 2010, 2010, pp. 102–106.

[11] Z. Yang, C. Liu, J. He, A new approach for fast generalized sphere decoding in MIMO systems, IEEE Signal Process. Lett. 12 (1) (Jan. 2005) 41–44.

[12] X.W. Chang, X. Yang, An efficient tree search decoder with column recording for underdetermined MIMO systems, in: Proc. IEEE GLOBECOM 2007, 2007, pp. 4375–4379.

[13] K.K. Wong, A. Paulraj, Efficient near maximum-likelihood detection for underdetermined MIMO antenna systems using a geometrical approach, EURASIP J. Wirel. Commun. Netw. 2007 (Oct. 2007) 084265.

[14] K.K. Wong, A. Paulraj, R.D. Murch, Efficient high-performance decoding for overloaded MIMO antenna systems, IEEE Trans. Wirel. Commun. 6 (5) (May 2007) 1833–1843.

[15] Y. Ding, Y. Wang, J.F. Diouris, Z. Yao, Robust fixed-complexity sphere decoders for rank-deficient MIMO systems, IEEE Trans. Wirel. Commun. 12 (9) (Sep. 2013) 4297–4305.

[16] L. Bai, C. Chen, J. Choi, Prevoting cancellation-based detection for underdetermined MIMO systems, EURASIP J. Wirel. Commun. Netw. 2010 (April 2010) 96.

[17] S.J. Chern, M.K. Cheng, P.S. Chao, Blind Capon-like adaptive ST-BC MIMO-CDMA receiver based on constant modulus criterion, Digit. Signal Process. 23 (6) (Aug. 2013) 1958–1966.

[18] K. Liu, S.S. Xing, Combined multi-stage MMSE and ML multiuser detection for underdetermined MIMO systems, in: Proc. CCWMC 2011 IET, 2011, pp. 10–14.

[19] T. Datta, N. Srinidhi, A. Chockalingam, B.S. Rajan, Low-complexity near-optimal signal detection in underdetermined large-MIMO systems, in: Proc. NCC 2012, 2012, pp. 1–5.

[20] K. Lee, J. Chun, ML symbol detection based on the shortest path algorithm for MIMO systems, IEEE Trans. Signal Process. 55 (11) (Nov. 2007) 5477–5484.

[21] A.K. Lenstra, H.W. Lenstra, L. Lovasz, Factoring polynomials with rational coefficients, Math. Ann. 261 (4) (1982) 513–534.

[22] C.J. Huang, C.Y. Wu, T.S. Lee, Geometry based efficient decoding algorithms for underdetermined MIMO systems, in: Proceedings of the IEEE SPAWC 2011, June 2011, pp. 371–375.

[23] W. Zhao, G.B. Giannakis, Reduced complexity closest point decoding algorithms for random lattices, IEEE Trans. Wirel. Commun. 5 (1) (Jan. 2006) 101–111.

[24] K.K. Wong, A. Paulraj, Near maximum-likelihood detection with reduced-complexity for multiple-input single-output antenna systems, in: Proc. Asilomar Conf. on Signals, Systems, and Computers, Nov. 2004.

[25] M.E. Muller, A note on a method for generating points uniformly on n-dimensional spheres, Commun. ACM 2 (1959) 19–20.

[26] N. Balakrishnan, A.C. Cohen, Order Statistics and Inference Estimation Methods, Academic Press, New York, NY, USA, 1991.

Chung-Jung Huang received the M.S. and Ph.D. degrees in Institute of Communications Engineering from National Chiao Tung University, Hsinchu, Taiwan, R.O.C., in 1994 and 2013, respectively. Since September 2013, he has been a Technical Director with MediaTek Inc., Hsinchu, a leading fabless semiconductor company in wireless communications and digital multimedia industries. His research interests include statistical signal processing and receiver design, with a particular focus on MIMO-OFDM systems.

Chang-Shen Lee received the B.S. degree in the Electrical Engineering and Computer Science Undergraduate Honors Program from National Chiao Tung University, Taiwan, in 2012. In 2014, he received the M.S. degree from the Institute of Communications Engineering at National Chiao Tung University, Taiwan. His research interests include cognitive radio networks and resource management for wireless communication systems.

Wei-Ho Chung received the B.S. and M.S. degrees in Electrical Engineering from the National Taiwan University, Taipei, Taiwan, in 2000 and 2002, respectively, and the Ph.D. degree in Electrical Engineering from the University of California, Los Angeles, in 2009. From 2002 to 2005, he was a system engineer at ChungHwa Telecommunications Company, where he worked on data networks. In 2008, he worked on CDMA systems at Qualcomm, Inc., San Diego, CA. His research interests include communications, signal processing, and networks. Dr. Chung received the Taiwan Merit Scholarship from 2005 to 2009 and the Best Paper Award in IEEE WCNC 2012, and has published over 40 journal articles and over 50 conference papers. Since January 2010, Dr. Chung has been an assistant research fellow at Academia Sinica, and he was promoted to the rank of associate research fellow in January 2014. He leads the Wireless Communications Lab in the Research Center for Information Technology Innovation, Academia Sinica, Taiwan.

Ta-Sung Lee received the B.S. degree from National Taiwan University in 1983, the M.S. degree from University of Wisconsin, Madison, in 1987, and the Ph.D. degree from Purdue University, W. Lafayette, IN, in 1989, all in electrical engineering. In 1990, he joined the Faculty of National Chiao Tung University (NCTU), Hsinchu, Taiwan, where he holds a position as Professor of Department of Electrical and Computer Engineering. From 1999 to 2001, he was Director of Communications & Computer Continuing Education Program, NCTU. From 2005 to 2007, he was Chairman of Department of Communication Engineering, and from 2007 to 2008, he was Dean of Student Affairs of NCTU. From 2008 to 2010, he was Commissioner of the National Communications Commission (NCC), a regulatory agency of Taiwan similar to the FCC, and responsible for the strategic planning, policy making and technical regulation for the telecommunications and broadcasting services in Taiwan. He was Vice Chairman and Chairman of IEEE Communications Society Taipei Chapter for 2005–2008, a Board Member of IEEE Taipei Section for 2007–2010, an Associate Editor of IEEE Transactions on Signal Processing for 2009–2013, and IEEE Signal Processing Society Regional Director-at-Large for R10 for 2011–2013. He is currently an Associate Editor of Journal of Signal Processing Systems. He has been Chairman of Telecom Technology Center, a government funded agency for telecommunications R&D, since 2013. Dr. Lee is actively involved in research and development in signal processing and system design for wireless communications. He has won several awards for his research, engineering and education contributions; these include National Science Council (NSC) Research Award, 1999 Young Electrical Engineer Award of the Chinese Institute of Electrical Engineering (CIEE), 2011 Distinguished Electrical Engineering Professor Award of CIEE, NCTU Distinguished Scholar Award, and NCTU Teaching Award.


