Exploiting structure and pseudoconvexity in iterative parallel optimization algorithms for real-time and large-scale applications

Yang Yang∗ and Marius Pesavento#

[email protected], [email protected]
∗University of Luxembourg, #Technische Universität Darmstadt


Acknowledgement

• Ganapati Hegde
• Minh Trinh Hoang
• Christian Steffens

Financial support from
• ADEL (EU FP7)
• EXPRESS (DFG Priority Program)
• Fantastic5G (EU Horizon 2020)
• One5G (EU Horizon 2020 Phase II)


Table of contents

1 Theory
  • Problem formulation and motivating examples
  • Descent direction and stepsize
  • Functions with different levels of convexity
  • Design of approximate functions

2 Applications
  • MIMO MAC capacity maximization
  • MIMO BC capacity maximization
  • Global energy efficiency maximization in MIMO systems
  • Sum energy efficiency maximization in MIMO systems
  • Nondifferentiable problems and LASSO
  • MD rank sparse regularization for MIMO channel estimation


Problem formulation: differentiable problems

• Consider the following optimization problem:

  (P):  minimize_x f(x)  subject to x ∈ X.

  Assume that
  • f(x) is differentiable and ∇f(x) is continuous;
  • X is closed and convex;
  • a solution of (P) exists.

• Objective: iterative algorithms that can solve (P) efficiently.
• Application: mutual information maximization of the MIMO BC

  maximize_{(Q_k)_{k=1}^K}  log|I + Σ_{k=1}^K H_k^H Q_k H_k|
  subject to  Q_k ⪰ 0, k = 1, …, K,  Σ_{k=1}^K tr(Q_k) ≤ P.


Iterative algorithms: variable update rule

Given x^t at iteration t:

• Step 1: Find a new point Bx^t ∈ X.
  Notation: Bx^t is a function of x^t such that Bx^t − x^t is a "good" direction.
• Step 2: Set x^{t+1} to be a point along that direction:

  x^{t+1} = x^t + γ^t · (Bx^t − x^t),  0 < γ^t ≤ 1,

  where γ^t is the stepsize and Bx^t − x^t is the update direction.
  Since X is convex, x^{t+1} ∈ X as long as x^t ∈ X and Bx^t ∈ X.
• Step 3: t ← t + 1 and go back to Step 1.

[Figure, shown over several animation frames: the descent direction step x^{t+1} = x^t + γ^t(Bx^t − x^t). Each frame shows the constraint set X, the surfaces of equal cost f(x), the gradient ∇f(x^t), the point Bx^t and the update direction Bx^t − x^t (for instance x^{t+1} with γ^t = 0.6); successive frames vary the stepsize γ^t, and the iterates approach the optimal point x⋆.]

The algorithms that we WILL and will NOT cover

We WILL cover, for nonconvex problems (goal: f(x^{t+1}) < f(x^t)):
• gradient (projection) method
• conditional gradient method
• block coordinate descent (Gauss-Seidel) algorithm
• Jacobi algorithm
• proximal (gradient) method
• successive convex approximation
• block successive upper-bound minimization (BSUM) method

We will NOT cover, for convex problems (goal: ‖x^{t+1} − x⋆‖ < ‖x^t − x⋆‖):
• Newton's method
• primal-dual algorithm
• interior point method
• augmented Lagrangian and ADMM


Problem formulation: nondifferentiable problems

• Consider the following problem, where g(x) is nondifferentiable:

  minimize_x  f(x) + g(x)  subject to x ∈ X,

  with f(x) differentiable and g(x) nondifferentiable.
• If g(x) is convex, the above problem can be rewritten as

  minimize_{x,y}  f(x) + y   ← differentiable
  subject to  x ∈ X, g(x) ≤ y   ← convex.

• Application: LASSO in sparse regularization

  minimize_x  (1/2)‖Ax − b‖₂² + μ‖x‖₁,

  with f(x) ≜ (1/2)‖Ax − b‖₂² and g(x) ≜ μ‖x‖₁.


Stationary point

• If x⋆ is a local minimum of (P) (i.e., a local minimum of f(x) over X), then the first-order optimality condition holds:

  ∇f(x⋆)^T(x − x⋆) ≥ 0,  ∀x ∈ X.

  For any arbitrary but fixed x ∈ X:

  0 ≤ lim_{γ↓0} [f(x⋆ + γ(x − x⋆)) − f(x⋆)]/γ
    = ∇_γ f(x⋆ + γ(x − x⋆))|_{γ=0}
    = ∇_x f(x⋆ + γ(x − x⋆))^T ∇_γ(x⋆ + γ(x − x⋆))|_{γ=0}
    = ∇f(x⋆)^T(x − x⋆).

  Note that x⋆ + γ(x − x⋆) is in the vicinity of x⋆ when γ ↓ 0.
  This is a necessary optimality condition; it is also sufficient if (P) is convex.
• If X = R^n, the above optimality condition reduces to ∇f(x⋆) = 0.


Stationary point

• A point x⋆ satisfying the above optimality condition is called a stationary point of (P).
• Finding a stationary point is the classic goal of algorithmic design in nonlinear programming.
• Stationary points and KKT points are generally not equivalent:
  even for convex problems, constraint qualifications must be satisfied to guarantee the existence of a KKT point.


Descent direction

• Variable update in iterative algorithms:

  x^{t+1} = x^t + γ^t · (Bx^t − x^t).

  Notation: Bx^t is a function of x^t such that Bx^t − x^t is a "good" direction.
• A vector Bx^t − x^t is a descent direction of f(x) at x = x^t if

  ∇f(x^t)^T(Bx^t − x^t) < 0.

• From the first-order Taylor expansion around x^t:

  f(x^t + γ(Bx^t − x^t)) = f(x^t) + γ∇f(x^t)^T(Bx^t − x^t) + o(γ‖Bx^t − x^t‖),

  where the first-order term is negative, so the sum of the last two terms is negative if γ is sufficiently small, and then f(x^{t+1}) < f(x^t).
• If Bx^t − x^t is a descent direction, then there exists a γ̄^t > 0 such that

  f(x^t + γ(Bx^t − x^t)) < f(x^t),  ∀γ ∈ (0, γ̄^t].

  See Ortega and Rheinboldt (1970, Ch. 8.2.1).


Descent direction

• In the (unconstrained) gradient method:

  Bx^t = x^t − ∇f(x^t).

• Bx^t − x^t is a descent direction:

  ∇f(x^t)^T(Bx^t − x^t) = ∇f(x^t)^T(x^t − ∇f(x^t) − x^t) = −‖∇f(x^t)‖₂² < 0.


Descent direction

• In the conditional gradient method:

  Bx^t = arg min_{x∈X} ∇f(x^t)^T(x − x^t).

[Figure: the conditional gradient update: the gradient ∇f(x^t), the minimizer Bx^t of the linearized problem over the constraint set X, the direction Bx^t − x^t, and the surfaces of equal cost f(x).]


Descent direction

• In the conditional gradient method:

  Bx^t = arg min_{x∈X} ∇f(x^t)^T(x − x^t).

• Assume Bx^t ≠ x^t; from the optimality of Bx^t:

  ∇f(x^t)^T(Bx^t − x^t) = min_{x∈X} ∇f(x^t)^T(x − x^t) < ∇f(x^t)^T(x^t − x^t) = 0.

• Therefore Bx^t − x^t is a descent direction:

  ∇f(x^t)^T(Bx^t − x^t) < 0.

• At each iteration, an approximate problem with objective f(x; x^t) is solved, where

  f(x; x^t) ≜ ∇f(x^t)^T(x − x^t).


Descent direction

• In the gradient projection method:

  Bx^t = [x^t − s∇f(x^t)]_X = arg min_{x∈X} ‖x − (x^t − s∇f(x^t))‖₂²  (≜ f(x; x^t)),

  where s > 0 and [•]_X is the projection operator.

[Figure: the gradient projection update: the unprojected point x^t − s∇f(x^t), its projection Bx^t onto the constraint set X, and the direction Bx^t − x^t.]


Descent direction

• In the gradient projection method:

  Bx^t = [x^t − s∇f(x^t)]_X = arg min_{x∈X} ‖x − (x^t − s∇f(x^t))‖₂²  (≜ f(x; x^t)),

  where s > 0 and [•]_X is the projection operator.
• From the first-order optimality condition (i.e., ∇f(x⋆)^T(x − x⋆) ≥ 0 for all x ∈ X) applied to the projection problem:

  (Bx^t − (x^t − s∇f(x^t)))^T(x − Bx^t) ≥ 0,  ∀x ∈ X.

• Setting x = x^t in this optimality condition yields

  (Bx^t − (x^t − s∇f(x^t)))^T(x^t − Bx^t) ≥ 0
  ⟹ ∇f(x^t)^T(Bx^t − x^t) ≤ −(1/s)‖Bx^t − x^t‖₂² < 0.


Stepsize

• Variable update in iterative algorithms:

  x^{t+1} = x^t + γ^t · (Bx^t − x^t).

• How far do we move along the descent direction?
  If Bx^t − x^t is a descent direction, then there exists a γ̄^t > 0 such that f(x^t + γ(Bx^t − x^t)) < f(x^t) for all γ ∈ (0, γ̄^t].

[Figure: a one-dimensional cost f(x) along the update direction, starting from x^t, with several candidate choices of x^{t+1}.]


Stepsize

• Constant stepsize: γ^t = γ, where γ is sufficiently small.
• Decreasing stepsize: γ^t → 0 and Σ_{t=0}^∞ γ^t = ∞; for example, γ^t = 1/t.
• Exact line search:

  γ^t ∈ arg min_{0≤γ≤1} f(x^t + γ(Bx^t − x^t)).

  It can be performed efficiently by bisection if f(x) is convex.

[Figure: a one-dimensional cost f(x) along the update direction; the exact line search stepsize γ^t yields the minimizing point x^{t+1}.]
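A minimal sketch of the exact line search by bisection for convex f, run on the derivative φ′(γ) = ∇f(x^t + γ(Bx^t − x^t))^T(Bx^t − x^t), which is nondecreasing in γ (function names hypothetical):

```python
import numpy as np

def exact_line_search(grad_f, x, d, tol=1e-8):
    """Bisection on [0, 1] for convex f along a descent direction d."""
    dphi = lambda gamma: grad_f(x + gamma * d) @ d   # phi'(gamma)
    if dphi(1.0) <= 0:        # f still decreasing at gamma = 1
        return 1.0
    lo, hi = 0.0, 1.0         # dphi(lo) < 0 <= dphi(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if dphi(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Example: f(x) = ||x - b||^2 from x = 0 along d = b; the minimizer is gamma = 1
b = np.array([0.3, 0.4])
gamma_t = exact_line_search(lambda x: 2 * (x - b), np.zeros(2), b)
```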


Stepsize

• Successive line search (Armijo rule): given two constants 0 < α, β < 1, we try γ^t = 1, β, β², … until the following condition is satisfied:

  f(x^t + β^m(Bx^t − x^t)) − f(x^t) ≤ αβ^m ∇f(x^t)^T(Bx^t − x^t),

  where x^{t+1} = x^t + β^m(Bx^t − x^t) and the right-hand side is negative. Then γ^t = β^{m_t}, where m_t is the smallest such integer m.

[Figure: f(x^t + γ(Bx^t − x^t)) − f(x^t) as a function of γ, together with the lines ∇f(x^t)^T(Bx^t − x^t)·γ and α∇f(x^t)^T(Bx^t − x^t)·γ; after unsuccessful stepsize trials at γ = 1 and γ = β, the set of acceptable stepsizes is reached at γ^t = β².]
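A minimal sketch of the successive (Armijo) line search (names hypothetical):

```python
import numpy as np

def armijo_line_search(f, grad_f, x, d, alpha=0.3, beta=0.5, max_trials=50):
    """Try gamma = 1, beta, beta^2, ... until sufficient decrease holds."""
    f0 = f(x)
    slope = grad_f(x) @ d       # = grad f(x)^T d < 0 for a descent direction
    gamma = 1.0
    for _ in range(max_trials):
        if f(x + gamma * d) - f0 <= alpha * gamma * slope:
            break
        gamma *= beta
    return gamma

# Example: f(x) = ||x||^2 at x = (1, 1) along d = -grad f(x)
x0 = np.array([1.0, 1.0])
gamma_t = armijo_line_search(lambda x: x @ x, lambda x: 2 * x, x0, -2 * x0)
```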


Convex functions

• A function is (strictly) convex if

  f(y) ≥ (>) f(x) + ∇f(x)^T(y − x),  ∀x, y ∈ X.

  A convex function is lower bounded by its first-order approximation.
• A function is strongly convex with constant a > 0 if

  f(y) ≥ f(x) + ∇f(x)^T(y − x) + (a/2)‖x − y‖₂²,  ∀x, y ∈ X.

  The difference must be big enough.

[Figure: a convex function f(x) lower bounded by its tangents.]


Convex functions

Characterizations of strongly convex functions:
• A function f(x) is strongly convex in X if and only if for some a > 0:

  ∇²f(x) ⪰ aI,  ∀x ∈ X.

  Why is the function shown in the figure not strongly convex?
• A function f(x) is strongly convex in X if and only if for some a > 0:

  (∇f(x) − ∇f(y))^T(x − y) ≥ a‖x − y‖₂²,  ∀x, y ∈ X.

• See Boyd and Vandenberghe (2004, Ch. 9.1.2).

[Figure: a convex function f(x) that is not strongly convex.]


Quasiconvex functions

• A function f(x) is (strictly) quasiconvex (or unimodal) if

  f((1 − α)x + αy) ≤ (<) max(f(x), f(y)),  ∀α ∈ (0, 1).

  The function value at any point between x and y is no larger than the larger of the two endpoint values.
  A function f(x) is quasiconvex if and only if it is unimodal.

[Figure: a quasiconvex function evaluated at two points x and y with values f(x) and f(y).]


Pseudoconvex functions

• A function f(x) is pseudoconvex if (Mangasarian, 1969):

  f(y) < f(x) ⟹ ∇f(x)^T(y − x) < 0.

• Function f₁(x) is pseudoconvex, because

  f₁(x) < f₁(y) and ∇f₁(y)(x − y) < 0, since ∇f₁(y) > 0 and x − y < 0.

• Function f₂(x) is quasiconvex but not pseudoconvex, because

  f₂(y) < f₂(x) but ∇f₂(x)(y − x) = 0, since ∇f₂(x) = 0 and y − x > 0.

[Figure: two scalar functions f₁ and f₂ evaluated at points x < y; f₂ has a flat (stationary) point at x that is not a global minimum.]


Pseudoconvex functions

• A function f(x) is said to be pseudoconvex if

  f(y) < f(x) ⟹ ∇f(x)^T(y − x) < 0.

• As for convex functions, every stationary point of a pseudoconvex function is a global minimum. To see this:
  Suppose x⋆ is a stationary point but not a global minimum, and y⋆ is a global minimum.
  Since x⋆ is a stationary point,

  ∇f(x⋆)^T(y − x⋆) ≥ 0,  ∀y ∈ X.

  Since y⋆ is a global minimum,

  f(y⋆) < f(x⋆) ⟹ ∇f(x⋆)^T(y⋆ − x⋆) < 0.

  A contradiction is derived.
• Unlike for convex functions, the sum of pseudoconvex functions is not necessarily pseudoconvex (shown by examples later).


Convex functions

Relationship of functions with different degrees of convexity:

  h(x) is strongly convex ⟹ h(x) is strictly convex ⟹ h(x) is convex
  ⟹ h(x) is pseudoconvex ⟹ h(x) is strictly quasiconvex ⟹ h(x) is quasiconvex.


A general framework

• Idea: The original problem (P) is solved by solving a sequence of successively refined approximate problems.
  Each of the approximate problems is much easier to solve than (P), e.g., it admits a closed-form solution or a parallel and distributed implementation...

[Figure: the original function f(x) and an approximate function f(x; x^t) constructed around the point x^t.]


A general framework: update direction

• Idea: The original problem (P) is solved by solving a sequence of successively refined approximate problems.
  Each of the approximate problems is much easier to solve than (P), e.g., it admits a closed-form solution or a parallel and distributed implementation...
• Suppose the approximate function around x^t is f(x; x^t):

  Bx^t ∈ S(x^t) ≜ {x⋆ ∈ X : f(x⋆; x^t) = min_{x∈X} f(x; x^t)}.

• Example: gradient projection method

  Bx^t = [x^t − s^t∇f(x^t)]_X
       = arg min_{x∈X} ‖x − (x^t − s^t∇f(x^t))‖₂²
       = arg min_{x∈X} { ∇f(x^t)^T(x − x^t) + (1/(2s^t))‖x − x^t‖₂² }  (≜ f(x; x^t)).


A general framework: update direction

• Suppose the approximate function around x^t is f(x; x^t):

  Bx^t ∈ S(x^t) ≜ {x⋆ ∈ X : f(x⋆; x^t) = min_{x∈X} f(x; x^t)}.

• The approximate function f(x; x^t) satisfies some technical conditions (Yang and Pesavento, 2017):
  (A1) Pseudoconvexity: f(x; x^t) is pseudoconvex in x ∈ X for any given x^t ∈ X;
  (A2) Gradient: f(x; x^t) is continuously differentiable in x for any given x^t, and ∇f(x^t; x^t) = ∇f(x^t);
  (A3) Continuity: f(x; x^t) is continuous in x^t for a fixed x.

[Figure: the original function f(x), the approximate function f(x; x^t) and its minimizer Bx^t; both functions have the same tangent at x^t.]


A general framework: update direction

• The approximate function f(x; x^t) satisfies some technical conditions:
  (A1) Pseudoconvexity: f(x; x^t) is pseudoconvex in x ∈ X for any given x^t ∈ X;
  (A2) Gradient: f(x; x^t) is continuously differentiable in x for a fixed x^t, and ∇f(x^t; x^t) = ∇f(x^t);
  (A3) Continuity: f(x; x^t) is continuous in x^t for a fixed x.

Proposition (Stationary point and descent direction)
Provided that Assumptions (A1)-(A3) are satisfied:
(i) A point x^t is a stationary point of (P) if and only if x^t ∈ S(x^t);
(ii) If x^t is not a stationary point of (P), then Bx^t − x^t is a descent direction of f(x) at x^t: ∇f(x^t)^T(Bx^t − x^t) < 0.


A general framework: update direction

• The approximate function f(x; x^t) satisfies the technical conditions (A1)-(A3) above.
• A point x^t is a stationary point of (P) if and only if x^t ∈ S(x^t):
  x^t ∈ S(x^t) ⟹ x^t is a stationary point of (P):

  x^t = arg min_{x∈X} f(x; x^t) ⟹ 0 ≤ ∇f(x^t; x^t)^T(x − x^t)  (first-order optimality condition)
                                       = ∇f(x^t)^T(x − x^t)      (by (A2)).

  x^t is a stationary point of (P) ⟹ x^t ∈ S(x^t):

  0 ≤ ∇f(x^t)^T(x − x^t) = ∇f(x^t; x^t)^T(x − x^t)  (by (A2))
  ⟹ x^t = arg min_{x∈X} f(x; x^t)  (by (A1)).


A general framework: update direction

• The approximate function f(x; x^t) satisfies the technical conditions (A1)-(A3) above.
• If x^t ∉ S(x^t), then Bx^t − x^t is a descent direction of f(x) at x = x^t:

  f(Bx^t; x^t) = min_{x∈X} f(x; x^t) < f(x^t; x^t)  (optimality of Bx^t)
  ⟹ 0 > ∇f(x^t; x^t)^T(Bx^t − x^t)  (by (A1))
     = ∇f(x^t)^T(Bx^t − x^t)         (by (A2)).


A general framework: stepsize

• The variable x^{t+1} is updated according to:

  x^{t+1} = x^t + γ^t(Bx^t − x^t).

• Exact line search:

  γ^t ∈ arg min_{0≤γ≤1} f(x^t + γ(Bx^t − x^t)).

  If f(x) is furthermore convex, then γ^t can be computed by bisection.
• Successive line search: given two constants 0 < α, β < 1, γ^t = β^{m_t}, where m_t is the smallest integer m such that

  f(x^t + β^m(Bx^t − x^t)) ≤ f(x^t) + αβ^m∇f(x^t)^T(Bx^t − x^t).


A general framework: convergence

Algorithm: choose x⁰ ∈ X; repeat the following steps until convergence:
• S1: Compute Bx^t ∈ arg min_{x∈X} f(x; x^t).
• S2: Compute γ^t by exact or successive line search.
• S3: Update x: x^{t+1} = x^t + γ^t(Bx^t − x^t) and t ← t + 1.

Theorem (Convergence to a stationary point)
Assume that Assumptions (A1)-(A3) as well as the following assumptions are satisfied:
• (A4): The set S(x^t) is nonempty for t = 1, 2, …;
• (A5): Given any convergent subsequence {x^t}_{t∈T}, where T ⊆ {1, 2, …}, the sequence {Bx^t}_{t∈T} is bounded.
Then {f(x^t)} is a decreasing sequence (f(x^{t+1}) < f(x^t)) and any limit point of {x^t} is a stationary point of (P) (and not a local maximum).
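A minimal sketch of steps S1-S3 with the successive line search as stepsize rule; the approximate-problem solver is passed in as a function, and all names are hypothetical. With the gradient projection choice of f(x; x^t), solve_approx is a projected gradient step:

```python
import numpy as np

def sca_framework(solve_approx, f, grad_f, x0, alpha=0.3, beta=0.5, n_iter=50):
    """S1: Bx^t = argmin_{x in X} f(x; x^t); S2: Armijo stepsize; S3: update."""
    x = x0.copy()
    for _ in range(n_iter):
        Bx = solve_approx(x)                      # S1
        d = Bx - x
        gamma, slope = 1.0, grad_f(x) @ d         # S2: successive line search
        while f(x + gamma * d) - f(x) > alpha * gamma * slope:
            gamma *= beta
        x = x + gamma * d                         # S3
    return x

# Example: f(x) = ||x - b||^2 over the box [0, 1]^n with the gradient
# projection approximate function (s = 0.1)
b = np.array([-0.5, 0.3, 1.7])
f = lambda x: np.sum((x - b) ** 2)
grad_f = lambda x: 2 * (x - b)
solve_approx = lambda x: np.clip(x - 0.1 * grad_f(x), 0.0, 1.0)
x_star = sca_framework(solve_approx, f, grad_f, np.zeros(3))
```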


A general framework: convergence

Proof idea (convergence to a stationary point): the key step is to show that B(x) is a closed mapping, i.e., if x^t → x and Bx^t → y, then y = Bx.

The assumption on pseudoconvexity cannot be further relaxed.
• Consider the following optimization problem:

  minimize x³  subject to −1 ≤ x ≤ 1.

• Consider the approximate problem at x^t = 0 obtained from the quasiconvex (but not pseudoconvex) approximate function f(x; x^t) = x³:

  Bx^t = arg min_{−1≤x≤1} x³ = −1,  while  0 = x^t ≠ arg min_{−1≤x≤1} x³ = −1.

• However, Bx^t − x^t is not a descent direction:

  (Bx^t − x^t) · ∇f(x^t) = (−1 − 0) · 0 = 0.


A general framework: convergence

• In some existing iterative algorithms, the approximate function must be a global upper bound of the original function f(x), i.e., f(x; x^t) ≥ f(x) and f(x^t; x^t) = f(x^t):
  BSUM, see Razaviyayn, Hong, and Luo (2013);
  majorization minimization, see Sun, Babu, and Palomar (2017);
  sequential programming, see Zappone, Björnson, Sanguinetti, and Jorswieck (2017).
• Advantage of such an approximate function:
  a constant unit stepsize can be chosen:

  x^{t+1} = x^t + 1 · (Bx^t − x^t) = Bx^t = arg min_{x∈X} f(x; x^t).

• Disadvantages of such an approximate function:
  it does not always exist;
  it may be difficult to optimize (illustrated later by examples);
  a constant unit stepsize may not be the optimal stepsize.


A general framework: convergence

• Note that the approximate function f(x; x^t) specified in (A1)-(A3) does not have to be a global upper bound of the original function f(x); f(x; x^t) ≥ f(x) is not required.
• Advantage: more flexibility to construct the approximate function.
• Disadvantage: more complexity to calculate the stepsize by the line search.

[Figure: the original function f(x) and an approximate function f(x; x^t) that shares the tangent at x^t but is not a global upper bound; Bx^t marks the minimizer of the approximate function.]


A general framework: convergence

If, in addition, (A6) f(x; x^t) ≥ f(x) and f(x^t; x^t) = f(x^t), the constant stepsize

  γ^t = 1

can be adopted, because it yields sufficient decrease of f(x) along Bx^t − x^t:

  f(x^t + 1 · (Bx^t − x^t)) = f(Bx^t)
    ≤ f(Bx^t; x^t)                                      ((A6), upper bound property)
    = min_{x∈X} f(x; x^t)                               (optimality of Bx^t)
    ≤ f(x^t + β^{m_t}(Bx^t − x^t); x^t)                 (line search performed on f(x; x^t))
    ≤ f(x^t; x^t) + αβ^{m_t}∇f(x^t; x^t)^T(Bx^t − x^t)  (definition of the line search)
    = f(x^t) + αβ^{m_t}∇f(x^t)^T(Bx^t − x^t)            ((A6), upper bound property).

• However, γ^t = 1 may not be the optimal stepsize.


A general framework: special cases

• Conditional gradient method (Bertsekas, 2016):

  Bx^t = arg min_{x∈X} ∇f(x^t)^T(x − x^t),

  where the objective is f(x; x^t): linear, hence convex.
• Gradient projection method (Bertsekas, 2016):

  Bx^t = [x^t − s∇f(x^t)]_X
       = arg min_{x∈X} ‖x − (x^t − s∇f(x^t))‖₂²
       = arg min_{x∈X} { ∇f(x^t)^T(x − x^t) + (1/(2s))‖x − x^t‖₂² },

  where the objective is f(x; x^t): strongly convex.
  Assumption verification:

  ∇_x f(x; x^t)|_{x=x^t} = [∇f(x^t) + (1/s)(x − x^t)]|_{x=x^t} = ∇f(x^t).


A general framework: special cases

Proximal point algorithm (Parikh and Boyd, 2014):
• The approximate problem can be formulated as follows:

  minimize_{x∈X}  f(x; x^t) ≜ f(x) + (c/2)‖x − x^t‖₂².

• The approximate function is an upper bound function:

  f(x; x^t) = f(x) + (c/2)‖x − x^t‖₂² ≥ f(x).

• f(x; x^t) is convex if c + λ_min(∇²f(x)) ≥ 0 for all x ∈ X.

Proximal gradient algorithm (Parikh and Boyd, 2014):
• The approximate problem is

  minimize_{x∈X}  f(x; x^t) ≜ f(x^t) + ∇f(x^t)^T(x − x^t) + (c/2)‖x − x^t‖₂².

• f(x; x^t) is convex, and f(x; x^t) ≥ f(x) if ∇f(x) is Lipschitz continuous (with constant at most c).
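A minimal sketch of the proximal gradient idea applied to the LASSO from the earlier slide (this is the standard ISTA iteration; variable names are hypothetical). Each approximate problem, the linearized smooth part plus the proximal term plus μ‖x‖₁, has the closed-form soft-thresholding solution:

```python
import numpy as np

def prox_gradient_lasso(A, b, mu, c, n_iter=200):
    """minimize 0.5*||Ax - b||_2^2 + mu*||x||_1 via proximal gradient (ISTA)."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - b) / c                         # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - mu / c, 0.0)  # soft threshold
    return x

# c should dominate the Lipschitz constant of the gradient: c >= ||A||_2^2
A = np.array([[1.0, 0.2], [0.2, 1.0], [0.1, 0.3]])
b = np.array([1.0, -0.5, 0.2])
x_hat = prox_gradient_lasso(A, b, mu=0.1, c=np.linalg.norm(A, 2) ** 2)
```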


A general framework: special cases

Jacobi algorithm:
• Consider the following optimization problem:

  minimize_{x=(x_k)_{k=1}^K}  f(x₁, …, x_K)  subject to x ∈ X,   ← nonconvex in x

  where f(x) is individually convex in each x_k, but not jointly convex in x.
• The approximate problem is (Yang and Pesavento, 2017)

  minimize_x  f(x; x^t) = Σ_{k=1}^K f(x_k, x^t_{−k})  subject to x ∈ X,   ← convex in x

  where x_{−k} ≜ (x_j)_{j≠k} and each summand is convex in x_k.
• Assumption verification:

  ∇_{x_k}( Σ_{j=1}^K f(x_j, x^t_{−j}) )|_{x=x^t}
    = ∇_{x_k}( f(x_k, x^t_{−k}) + Σ_{j≠k} f(x_j, x^t_{−j}) )|_{x=x^t}
    = ∇_{x_k} f(x_k, x^t_{−k})|_{x=x^t}
    = ∇_k f(x^t).


A general framework: special cases

Jacobi algorithm:
• The approximate problem is (Yang and Pesavento, 2017)

  minimize_x  f(x; x^t) = Σ_{k=1}^K f(x_k, x^t_{−k})  subject to x ∈ X.

• If X = X₁ × … × X_K, the approximate problem can be decomposed into K smaller optimization problems that are solved in parallel (Jacobi algorithm):

  minimize_{(x_k∈X_k)_{k=1}^K} Σ_{k=1}^K f(x_k, x^t_{−k})  ⟹  minimize_{x_k∈X_k} f(x_k, x^t_{−k}),  k = 1, …, K.

• The stepsize is determined by the line search; see the sketch after this list.
• The Jacobi (best-response) algorithm exhibits a fast convergence speed.
• The convergence conditions can be further relaxed (shown by examples later)!
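A minimal sketch of the Jacobi best-response update on a toy problem (a hypothetical illustration, not the MIMO application): f(x₁, x₂) = (x₁x₂ − 1)² is convex in each scalar block separately but not jointly convex, and each block's best response has a closed form:

```python
import numpy as np

# Toy objective: individually convex in x1 and in x2, not jointly convex.
f = lambda x: (x[0] * x[1] - 1.0) ** 2

def best_response(k, x):
    """Closed-form minimizer of f in block k, with the other block fixed."""
    other = x[1 - k]
    return 1.0 / other if abs(other) > 1e-12 else 1.0

x = np.array([2.0, 0.2])
for t in range(20):
    Bx = np.array([best_response(k, x) for k in range(2)])  # parallel step
    gamma = 0.5                   # in general chosen by a line search
    x = x + gamma * (Bx - x)      # Jacobi update: x^{t+1} = x^t + gamma(Bx^t - x^t)
```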


A general framework: special cases

Jacobi algorithm:
• If X = {x : Σ_{k=1}^K h_k(x_k) ≤ 0}, the approximate problem is

  minimize_x  f(x; x^t) = Σ_{k=1}^K f(x_k, x^t_{−k})
  subject to  Σ_{k=1}^K h_k(x_k) ≤ 0.

• It can be solved by primal/dual decomposition techniques.
• For example, in dual decomposition, the Lagrangian is

  L(x, μ; x^t) = Σ_{k=1}^K f(x_k, x^t_{−k}) + μ^T( Σ_{k=1}^K h_k(x_k) )
               = Σ_{k=1}^K ( f(x_k, x^t_{−k}) + μ^T h_k(x_k) ).

• The dual function d(μ) is

  d(μ) = min_{x=(x_k)_{k=1}^K} L(x, μ; x^t)  ⟹  min_{x_k} { f(x_k, x^t_{−k}) + μ^T h_k(x_k) },  ∀k.

• The dual variable μ is updated by the subgradient method (by the bisection method if μ is a scalar); this is illustrated later for the MIMO BC.


Dual variable update using subgradient

Dual variable update in the inner iterations, with inner iteration index m:
• Initialization: choose an initial dual variable μ^(0).
• Step 1: Compute

  Bx^t_k(μ^(m)) ∈ arg min_{x_k} { f(x_k, x^t_{−k}) + μ^(m)T h_k(x_k) },  ∀k.

• Note that Σ_{k=1}^K h_k(Bx^t_k(μ^(m))) is a subgradient of the dual function at μ^(m) (Bertsekas, 2016, Sec. 7.1).
• Step 2: Update the dual variable with the subgradient:

  μ^(m+1) = μ^(m) + α^(m) Σ_{k=1}^K h_k(Bx^t_k(μ^(m))),

  where α^(m) is a properly chosen decreasing stepsize.
• Step 3: m ← m + 1 and go back to Step 1.
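A minimal sketch of this inner loop for a scalar dual variable (all names hypothetical; solve_block stands for the per-block minimization in Step 1, and the update is projected onto μ ≥ 0 for an inequality constraint):

```python
def dual_subgradient(solve_block, h, mu0=0.0, n_iter=100):
    """Steps 1-3: per-block minimization for fixed mu, then a subgradient
    step on mu with the decreasing stepsize alpha^(m) = 1/(m+1)."""
    mu = mu0
    K = len(h)
    for m in range(n_iter):
        x = [solve_block(k, mu) for k in range(K)]     # Step 1 (parallelizable)
        subgrad = sum(h[k](x[k]) for k in range(K))    # subgradient of d(mu)
        mu = max(0.0, mu + (1.0 / (m + 1)) * subgrad)  # Step 2
    return mu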


A general framework: special cases

Partial linearization method (Scutari, Facchinei, Song, Palomar, and Pang, 2014):
• In a multi-user communication network, the cost function of user k is f_k(x_k, x_{−k}) (with x_{−k} = (x_j)_{j≠k}), and it is convex in x_k only.
• Minimize the sum-cost function:

  minimize_{x=(x_k)_{k=1}^K}  Σ_{k=1}^K f_k(x_k, x_{−k})
  subject to  x_k ∈ X_k, k = 1, …, K.

• Let us look at the objective function from x_k's perspective:

  Σ_{j=1}^K f_j(x) = f_k(x_k, x_{−k})  (convex in x_k)  +  Σ_{j≠k} f_j(x_j, x_{−j})  (nonconvex in x_k).


A general framework: special cases

Partial linearization method (Scutari, Facchinei, Song, Palomar, and Pang, 2014):
• The approximate problem can be formulated as follows:

  minimize_{(x_k∈X_k)_{k=1}^K}  f(x; x^t) ≜ Σ_{k=1}^K f_k(x_k; x^t),

  where

  f_k(x_k; x^t) = f_k(x_k, x^t_{−k})  (convex)  +  ⟨x_k − x^t_k, Σ_{j≠k} ∇_{x_k} f_j(x^t)⟩  (linear, hence convex).

• Assumption verification:

  ∇_{x_k} Σ_{j=1}^K f_j(x_j; x^t)|_{x=x^t} = ∇_{x_k} f_k(x_k; x^t)|_{x=x^t}
    = ∇_{x_k} f_k(x^t) + Σ_{j≠k} ∇_{x_k} f_j(x^t)
    = ∇_{x_k}( Σ_{j=1}^K f_j(x) )|_{x=x^t}.


A general framework: special cases

Partial linearization method (Scutari, Facchinei, Song, Palomar, and Pang, 2014):
• The approximate problem

  minimize_{(x_k∈X_k)_{k=1}^K}  f(x; x^t) ≜ Σ_{k=1}^K f_k(x_k; x^t),

  with f_k(x_k; x^t) = f_k(x_k, x^t_{−k}) + ⟨x_k − x^t_k, Σ_{j≠k} ∇_{x_k} f_j(x^t)⟩ as above, can be solved in parallel among the users:

  minimize_{x_k∈X_k} f_k(x_k; x^t),  k = 1, …, K.


A general framework: other cases

The block successive upper-bound minimization (BSUM) method (Razaviyayn et al., 2013) cannot be analyzed by the introduced framework.
• The approximate problem at iteration t is

  minimize_{x_k∈X_k}  f(x_k; x^t),

  where k = mod(t − 1, K) + 1 and

  f(x_k; x^t) ≥ f(x_k, x^t_{−k}),
  f(x^t_k; x^t) = f(x^t_k, x^t_{−k}),
  ∇_k f(x_k; x^t)|_{x=x^t} = ∇_k f(x_k, x^t_{−k})|_{x=x^t}.

• It does not satisfy the assumption on equal gradients (A2):

  ∇_k f(x_k; x^t)|_{x=x^t} = ∇_k f(x_k, x^t_{−k})|_{x=x^t},  but
  ∇_j f(x_k; x^t)|_{x=x^t} = 0 ≠ ∇_j f(x)|_{x=x^t},  ∀j ≠ k.


Outline

1 Theory
  • Problem formulation and motivating examples
  • Descent direction and stepsize
  • Functions with different levels of convexity
  • Design of approximate functions

2 Applications
  • MIMO MAC capacity maximization
  • MIMO BC capacity maximization
  • Global energy efficiency maximization in MIMO systems
  • Sum energy efficiency maximization in MIMO systems
  • Nondifferentiable problems and LASSO
  • MD rank sparse regularization for MIMO channel estimation


MIMO MAC capacity maximization: sequential algorithm

• Suppose H_k is the channel from user k to the base station.
• The sum capacity of the MIMO MAC is

  maximize_{Q_k}  log|I + Σ_{k=1}^K H_k Q_k H_k^H|
  subject to  Q_k ⪰ 0, tr(Q_k) ≤ P_k, k = 1, …, K.

• The famous iterative (sequential) water-filling algorithm (Yu, Rhee, Boyd, and Cioffi, 2004) is essentially the block coordinate descent (BCD) algorithm (Bertsekas, 2016, Sec. 3.7):

  Q₁^{t+1} = arg max_{tr(Q₁)≤P₁} log|I + H₁Q₁H₁^H + Σ_{j=2}^K H_j Q_j^t H_j^H|
  Q₂^{t+1} = arg max_{tr(Q₂)≤P₂} log|I + H₁Q₁^{t+1}H₁^H + H₂Q₂H₂^H + Σ_{j=3}^K H_j Q_j^t H_j^H|
  ⋮
  Q_K^{t+1} = arg max_{tr(Q_K)≤P_K} log|I + Σ_{j=1}^{K−1} H_j Q_j^{t+1} H_j^H + H_K Q_K H_K^H|

Each update is solved in closed form by water-filling; a sketch follows.
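A minimal sketch of the per-user water-filling step, assuming a noise-plus-interference covariance R (a hypothetical helper: it computes the best response max_{tr(Q)≤P, Q⪰0} log|I + R⁻¹HQH^H| by water-filling over the eigenmodes of H^H R⁻¹ H, with the water level found by bisection):

```python
import numpy as np

def water_filling(H, R, P, tol=1e-9):
    """Best response: maximize log|I + inv(R) H Q H^H| s.t. tr(Q) <= P, Q >= 0."""
    lam, U = np.linalg.eigh(H.conj().T @ np.linalg.solve(R, H))
    lam = np.maximum(lam, tol)            # effective eigenmode gains
    # Bisection on the water level w: sum_i max(w - 1/lam_i, 0) = P
    lo, hi = 0.0, P + np.max(1.0 / lam)
    while hi - lo > tol:
        w = 0.5 * (lo + hi)
        if np.maximum(w - 1.0 / lam, 0.0).sum() < P:
            lo = w
        else:
            hi = w
    p = np.maximum(0.5 * (lo + hi) - 1.0 / lam, 0.0)  # powers per eigenmode
    return U @ np.diag(p) @ U.conj().T                # optimal covariance Q
```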


MIMO MAC capacity maximization: parallel algorithm

• Suppose H_k is the channel from user k to the base station. The sum capacity of the MIMO MAC is

  maximize_{Q_k}  log|I + Σ_{k=1}^K H_k Q_k H_k^H|
  subject to  Q_k ⪰ 0, tr(Q_k) ≤ P_k, k = 1, …, K.

• Is there an iterative simultaneous water-filling algorithm? Yes.
• The approximate problem at the t-th iteration is

  maximize_{(Q_k)_{k=1}^K}  Σ_{k=1}^K log|I + Σ_{j≠k} H_j Q_j^t H_j^H + H_k Q_k H_k^H|
  subject to  Q_k ⪰ 0, tr(Q_k) ≤ P_k, k = 1, …, K,

  and it decomposes into K per-user problems:

  maximize_{Q₁⪰0, tr(Q₁)≤P₁}  log|I + Σ_{j≠1} H_j Q_j^t H_j^H + H₁Q₁H₁^H|
  ⋮
  maximize_{Q_K⪰0, tr(Q_K)≤P_K}  log|I + Σ_{j≠K} H_j Q_j^t H_j^H + H_K Q_K H_K^H|

• Each per-user problem has a closed-form solution based on water-filling over the noise and "interference" (pre-whitening): parallel implementation.


Stepsize computation based on exact line search

• Recall the update procedure and the exact line search:

  x^{t+1} = x^t + γ^t(Bx^t − x^t),  γ^t ∈ arg min_{0≤γ≤1} f(x^t + γ(Bx^t − x^t)),

  where γ^t is the stepsize and Bx^t − x^t is the update direction.
• With Q_k^{t+1} = Q_k^t + γ^t(BQ_k^t − Q_k^t), the exact line search consists in solving the convex optimization problem

  γ^t = arg max_{0≤γ≤1} log|I + Σ_{k=1}^K H_k(Q_k^t + γ(BQ_k^t − Q_k^t))H_k^H|.

• This concave problem can be solved efficiently using the bisection method (linear convergence).


Stepsize computation based on exact line search

• Alternative: use an efficient customized algorithm (Bunch, Nielsen, and Sorensen, 1978):

  γ^t = arg max_{0≤γ≤1} log|A + γB|,

  where A ≜ I + Σ_{k=1}^K H_k Q_k^t H_k^H and B ≜ Σ_{k=1}^K H_k(BQ_k^t − Q_k^t)H_k^H.
• Define λ₁, …, λ_r as the generalized eigenvalues of the pair (B, A), partitioned into the negative set {λ₁, …, λ_{r′}} and the positive set {λ_{r′+1}, …, λ_r}, respectively. Then

  γ^t = arg max_{0≤γ≤1} Σ_{i=1}^r log(1 + γλ_i).

• This results in numerically rooting

  Σ_{i=1}^r λ_i/(1 + γλ_i) = 0  ⟺  Σ_{i=1}^{r′} 1/(1/|λ_i| − γ) = Σ_{i=r′+1}^r 1/(1/|λ_i| + γ).
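For illustration, a minimal sketch of this line search using the generalized eigenvalues, but with plain bisection on the derivative instead of the quadratically convergent rational-approximation method described next (names hypothetical):

```python
import numpy as np
from scipy.linalg import eigh

def line_search_logdet(A, B, tol=1e-10):
    """Maximize log|A + gamma*B| over gamma in [0, 1], with A Hermitian
    positive definite, via the generalized eigenvalues of the pair (B, A)."""
    lam = eigh(B, A, eigvals_only=True)              # solves B v = lambda A v
    dphi = lambda g: np.sum(lam / (1.0 + g * lam))   # derivative in gamma
    if dphi(1.0) >= 0.0:                             # still increasing at 1
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if dphi(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)
```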


Stepsize computation based on exact line search

  ψ(γ) ≜ Σ_{i=1}^{r′} 1/(1/|λ_i| − γ)  =  Σ_{i=r′+1}^r 1/(1/|λ_i| + γ) ≜ φ(γ)

• Iteratively approximate each side by simple rational functions and root the resulting quadratic function:

  ψ_{p,q}(γ) ≜ p/(q − γ)  =  u + v/(1/|λ_{r′+1}| − γ) ≜ φ_{u,v}(γ).

• At iteration τ + 1 and point γ^(τ), choose p and q such that

  ψ_{p,q}(γ)|_{γ=γ^(τ)} = ψ(γ)|_{γ=γ^(τ)},   ∇ψ_{p,q}(γ)|_{γ=γ^(τ)} = ∇ψ(γ)|_{γ=γ^(τ)}.

• Similarly, choose u and v such that

  φ_{u,v}(γ)|_{γ=γ^(τ)} = φ(γ)|_{γ=γ^(τ)},   ∇φ_{u,v}(γ)|_{γ=γ^(τ)} = ∇φ(γ)|_{γ=γ^(τ)}.

• The algorithm has quadratic convergence!


Stepsize computation based on exact line search

[Figure: rational approximation at γ^(τ) = 0.2: the functions ψ^(τ)(γ) and φ^(τ)(γ) together with their rational approximations ψ_{p,q}^(τ)(γ) and φ_{u,v}^(τ)(γ).]

MIMO MAC capacity maximization

• Rayleigh channel, SNR = 10 dB.
• Number of Tx and Rx antennas: 5 and 10.
• Number of users: K = 10 and K = 100.

[Figure: sum capacity (nats/s) versus iteration, for K = 100 and K = 10 users.]

• The algorithm scales well with the number of users K.
• The complexity of the line search does not depend on the number of users K.


MIMO BC capacity maximization

• Suppose H_k is the channel from the base station to user k.
• The sum capacity of the MIMO BC is (based on the downlink-uplink duality)

  maximize_{Q_k}  log|I + Σ_{k=1}^K H_k^H Q_k H_k|
  subject to  Q_k ⪰ 0, k = 1, …, K,  Σ_{k=1}^K tr(Q_k) ≤ P.

• The approximate problem at the t-th iteration is

  maximize_{Q_k}  Σ_{k=1}^K log|I + Σ_{j≠k} H_j^H Q_j^t H_j + H_k^H Q_k H_k|
  subject to  Q_k ⪰ 0, k = 1, …, K,  Σ_{k=1}^K tr(Q_k) ≤ P,

  where the power constraint couples the users but is separable.
• A parallel implementation is possible based on dual decomposition (details below).


MIMO BC capacity maximization

• The Lagrangian is

  L(Q, μ) = Σ_{k=1}^K log|I + Σ_{j≠k} H_j^H Q_j^t H_j + H_k^H Q_k H_k| + μ( Σ_{k=1}^K tr(Q_k) − P ).

• The dual function is d(μ) = max_{(Q_k⪰0)_{k=1}^K} L(Q, μ), which decomposes into per-user problems:

  max_{Q₁⪰0}  log|I + Σ_{j≠1} H_j^H Q_j^t H_j + H₁^H Q₁H₁| + μ tr(Q₁)
  max_{Q₂⪰0}  log|I + Σ_{j≠2} H_j^H Q_j^t H_j + H₂^H Q₂H₂| + μ tr(Q₂)
  ⋮
  max_{Q_K⪰0}  log|I + Σ_{j≠K} H_j^H Q_j^t H_j + H_K^H Q_K H_K| + μ tr(Q_K)

• Each per-user problem has a closed-form solution based on water-filling.
• The optimal dual variable μ⋆ can be found efficiently by the bisection method.


MIMO BC capacity maximization

• State of the art: the iterative algorithm in Jindal, Rhee, Vishwanath, Jafar, and Goldsmith (2005):
  the same approximate function, but with a constant stepsize γ^t = 1/K.

[Figure: sum rate (nats/s) versus the number of iterations, for 20 and 100 users, comparing the sum capacity (benchmark), the parallel update with fixed stepsize (state of the art), and the parallel update with exact line search (proposed).]

• Complexity per iteration: 0.0023 s water-filling + 0.0018 s exact line search.


Global energy efficiency maximization in MIMO systems

• MIMO multi-cell network with K cells.
• Each cell serves a single user in the downlink.
• Q_k: transmit covariance matrix of user k.
• σ_k²: noise covariance at user k.
• H_{kj} (for all k, j): channel from BS j to user k.
• Inter-cell interference is treated as noise.
• Σ_{j≠k} H_{kj} Q_j H_{kj}^H: interference covariance at user k.
• Achievable rate of link k (downlink):

  r_k(Q) ≜ log|I + (σ_k²I + Σ_{j≠k} H_{kj} Q_j H_{kj}^H)^{−1} H_{kk} Q_k H_{kk}^H|.

  r_k(Q) is concave in Q_k, but not jointly concave in Q = (Q_j)_{j=1}^K.


Global energy efficiency maximization in MIMO systems

• Global energy efficiency (GEE) of the network: ratio of the sum transmission rate to the sum power consumption:

  maximize_Q  f(Q) ≜ Σ_{k=1}^K r_k(Q) / Σ_{k=1}^K (P_{0,k} + ρ_k tr(Q_k))   ← nonconcave
  subject to  Q_k ⪰ 0, tr(Q_k) ≤ P_k, ∀k.

• P_{0,k}: the power consumption at zero RF output power.
• ρ_k: the slope of the load-dependent power consumption.
• P_k: the maximum transmission power.
• The GEE maximization problem is nonconvex and NP-hard (even when only maximizing the sum rate in the numerator, see Luo and Zhang (2008)).
• For the same problem, there exist different approximate functions:
  the choice of approximate function depends on the problem structure;
  different choices of approximate functions lead to different algorithms.


Global energy efficiency maximization in MIMO systems

One choice of the approximate function is based on lower bound maximization, see Zappone, Björnson, Sanguinetti, and Jorswieck (2017).
• Achievable rate of link k (downlink):

  r_k(Q) = log|I + (σ_k²I + Σ_{j≠k} H_{kj} Q_j H_{kj}^H)^{−1} H_{kk} Q_k H_{kk}^H|
         = log|σ_k²I + Σ_{j=1}^K H_{kj} Q_j H_{kj}^H|  (≜ r_k^+(Q), concave)
           − log|σ_k²I + Σ_{j≠k} H_{kj} Q_j H_{kj}^H|  (≜ r_k^−(Q), concave).

• Numerator function approximation Σ_{k=1}^K r_k^{(a)}(Q; Q^t) (choice (a)):

  r_k^{(a)}(Q; Q^t) ≜ r_k^+(Q)  (concave in Q)
                     − r_k^−(Q^t)  (constant)
                     − Σ_{j≠k} (Q_j − Q_j^t) • ∇_j r_k^−(Q^t)  (linear in Q),

  where A • B ≜ ℜ(tr(A^H B)).
• Σ_{k=1}^K r_k^{(a)}(Q; Q^t) is a global lower bound of Σ_{k=1}^K r_k(Q):

  r_k^{(a)}(Q; Q^t) ≤ r_k(Q).


Global energy efficiency maximization in MIMO systems

In choice (a), the approximate function at Q = Q^t is chosen as

  f^{(a)}(Q; Q^t) = r^{(a)}(Q; Q^t)/P(Q) ≜ Σ_{k=1}^K r_k^{(a)}(Q; Q^t) / Σ_{k=1}^K (P_{0,k} + ρ_k tr(Q_k)).

• Numerator function (concave in Q):

  r_k^{(a)}(Q; Q^t) ≜ r_k^+(Q)  (concave in Q)  − r_k^−(Q^t)  (constant)  − Σ_{j≠k} (Q_j − Q_j^t) • ∇_j r_k^−(Q^t)  (linear in Q).

• Denominator function (convex in Q):

  P(Q) ≜ Σ_{k=1}^K (P_{0,k} + ρ_k tr(Q_k)).

• f^{(a)}(Q; Q^t) is a global lower bound of f(Q), with equality at Q = Q^t:

  f^{(a)}(Q; Q^t) ≤ f(Q) and f^{(a)}(Q^t; Q^t) = f(Q^t).


Global energy efficiency maximization in MIMO systems

For approximate function (a), defining $r^{(a)}(Q; Q^t) \triangleq \sum_{k=1}^K r_k^{(a)}(Q; Q^t)$, the following properties hold:

• equal function value at $Q^t$: $r^{(a)}(Q; Q^t)\big|_{Q=Q^t} = r(Q)\big|_{Q=Q^t}$;
• equal gradient at $Q^t$: $\nabla r^{(a)}(Q; Q^t)\big|_{Q=Q^t} = \nabla r(Q)\big|_{Q=Q^t}$;

such that

• equal function value at $Q^t$: $f^{(a)}(Q; Q^t)\big|_{Q=Q^t} = \frac{r^{(a)}(Q; Q^t)}{P(Q)}\Big|_{Q=Q^t} = f(Q^t)$;
• quotient rule: $\nabla f^{(a)}(Q; Q^t) = \dfrac{\nabla r^{(a)}(Q; Q^t)\, P(Q) - r^{(a)}(Q; Q^t)\, \nabla P(Q)}{P^2(Q)}$;
• equal gradient at $Q^t$: $\nabla f^{(a)}(Q; Q^t)\big|_{Q=Q^t} = \nabla f(Q^t)$.


Global energy efficiency maximization in MIMO systems

We define the approximate function at $Q = Q^t$ as:

$$f^{(a)}(Q; Q^t) = \underbrace{\frac{\sum_{k=1}^K r_k^{(a)}(Q; Q^t) \leftarrow \text{concave in } Q}{\sum_{k=1}^K (P_{0,k} + \rho_k \mathrm{tr}(Q_k)) \leftarrow \text{convex in } Q}}_{\text{pseudoconcave in } Q}.$$

• The ratio of a concave function and a convex function is generally not concave.
• However, $f^{(a)}(Q; Q^t)$, the ratio of a nonnegative concave function and a positive convex function, is pseudoconcave (A1).
• The lower bound assumption (A6) is satisfied and a constant unit stepsize can be used:

$$\gamma^t = 1 \implies Q^{t+1} = \underset{(Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k)_{k=1}^K}{\arg\max}\ f^{(a)}(Q; Q^t).$$

• A general-purpose optimization solver must be used for the approximate problem, which may be computationally expensive.


Global energy efficiency maximization in MIMO systems

Another choice of the approximate function, denoted as choice (b), is based on the best-response (Yang and Pesavento, 2017).

• The original function $f(Q)$ is:

$$f(Q) = \frac{\sum_{k=1}^K r_k(Q)}{P(Q)}.$$

• We first define the approximate function at $Q = Q^t$ as:

$$f^{(b)}(Q; Q^t) = \frac{\sum_{k=1}^K r_k^{(b)}(Q_k; Q_{-k}^t)}{P(Q)}.$$

• Design approximate numerator functions $r_k^{(b)}(Q_k; Q_{-k}^t) = r_k(Q_k, Q_{-k}^t) + \sum_{j\neq k} r_j(Q_k, Q_{-k}^t)$ to approximate $\sum_{k=1}^K r_k(Q)$ in the variable $Q_k$, for $k = 1, \dots, K$, following a Jacobi-type approach.


Global energy efficiency maximization in MIMO systems

• Linearize the non-concave part of the approximate numerator function $r_k^{(b)}(Q_k; Q^t)$ such that the equal gradient condition (A2), $\nabla f^{(b)}(Q^t; Q^t) = \nabla f(Q^t)$, holds:

$$r_k^{(b)}(Q_k; Q^t) \triangleq \underbrace{r_k(Q_k, Q_{-k}^t)}_{\text{concave in } Q_k} + \underbrace{(Q_k - Q_k^t) \bullet \sum_{j\neq k} \nabla_k r_j(Q_k^t, Q_{-k}^t)}_{\text{linear in } Q_k}.$$

• From the quotient rule:

$$\nabla f^{(b)}(Q; Q^t) = \frac{\nabla r^{(b)}(Q; Q^t)\, P(Q) - r^{(b)}(Q; Q^t)\, \nabla P(Q)}{P^2(Q)}.$$

• The following zero- and first-order conditions must hold:

$$r_k^{(b)}(Q_k; Q_{-k}^t)\big|_{Q=Q^t} = r_k(Q)\big|_{Q=Q^t}, \qquad \nabla_k r_k^{(b)}(Q_k; Q_{-k}^t)\big|_{Q=Q^t} = \nabla_k \Big(\sum_{j=1}^K r_j(Q)\Big)\Big|_{Q=Q^t}.$$


Global energy efficiency maximization in MIMO systems

Choice (b) defines the approximate function at $Q = Q^t$ as:

$$f^{(b)}(Q; Q^t) = \underbrace{\frac{\sum_{k=1}^K r_k^{(b)}(Q_k; Q^t) \leftarrow \text{concave in } Q}{\sum_{k=1}^K (P_{0,k} + \rho_k \mathrm{tr}(Q_k)) \leftarrow \text{convex in } Q}}_{\text{pseudoconcave in } Q} \neq f^{(a)}(Q; Q^t).$$

• The ratio of a concave function and a convex function is generally not concave.
• However, $f^{(b)}(Q; Q^t)$, the ratio of a nonnegative concave function and a positive convex function, is pseudoconcave.
• Stepsize: successive line search (the GEE function is nonconcave).
• Fast convergence: the problem structure is preserved as much as possible, while the problem is made separable.


Global energy efficiency maximization in MIMO systems

The best response at $Q = Q^t$ (the solution of the approximate problem) is:

$$\mathbb{B}Q^t = \underset{(Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k)_{k=1}^K}{\arg\max}\ \frac{\sum_{k=1}^K r_k^{(b)}(Q_k; Q^t)}{\sum_{k=1}^K (P_{0,k} + \rho_k \mathrm{tr}(Q_k))}.$$

• This is a fractional programming problem and can be solved by Dinkelbach's algorithm:

$$\underset{(Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k)_{k=1}^K}{\arg\max}\ \sum_{k=1}^K r_k^{(b)}(Q_k; Q^t) - \lambda \sum_{k=1}^K (P_{0,k} + \rho_k \mathrm{tr}(Q_k)).$$

• This problem is separable among the variables $(Q_k)_{k=1}^K$.


Global energy efficiency maximization in MIMO systems

The best response at $Q = Q^t$ is:

$$\mathbb{B}Q^t = \underset{(Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k)_{k=1}^K}{\arg\max}\ \frac{\sum_{k=1}^K r_k^{(b)}(Q_k; Q^t)}{\sum_{k=1}^K (P_{0,k} + \rho_k \mathrm{tr}(Q_k))}.$$

• It is a fractional programming problem and can be solved by Dinkelbach's algorithm, whose inner problem decomposes into $K$ parallel subproblems:

$$\underset{Q_1 \succeq 0,\ \mathrm{tr}(Q_1) \le P_1}{\arg\max}\ r_1^{(b)}(Q_1; Q^t) - \lambda (P_{0,1} + \rho_1 \mathrm{tr}(Q_1))$$
$$\vdots$$
$$\underset{Q_K \succeq 0,\ \mathrm{tr}(Q_K) \le P_K}{\arg\max}\ r_K^{(b)}(Q_K; Q^t) - \lambda (P_{0,K} + \rho_K \mathrm{tr}(Q_K))$$

Each solution can be computed in closed form.
• Easy implementation: closed-form expression + parallel decomposition.


Global energy efficiency maximization in MIMO systems

Dinkelbach's algorithm: initialize at $\tau = 0$ with $\lambda^{t,0} = 0$ (or any other value).

Step 1: Given $\lambda^{t,\tau}$, at iteration $\tau + 1$ solve the following concave problem:

$$Q_k(\lambda^{t,\tau}) \triangleq \underset{Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k}{\arg\max}\ r_k^{(b)}(Q_k; Q^t) - \lambda^{t,\tau} (P_{0,k} + \rho_k \mathrm{tr}(Q_k)).$$

This solution can be computed in closed form.

Step 2: The variable $\lambda^{t,\tau}$ is then updated in iteration $\tau + 1$ as

$$\lambda^{t,\tau+1} = \frac{\sum_{k=1}^K r_k^{(b)}(Q_k(\lambda^{t,\tau}); Q^t)}{\sum_{k=1}^K \big(P_{0,k} + \rho_k \mathrm{tr}(Q_k(\lambda^{t,\tau}))\big)}.$$

• Dinkelbach's algorithm converges, $\lim_{\tau\to\infty} Q(\lambda^{t,\tau}) = \mathbb{B}Q^t$, at a superlinear rate (a sketch of the loop follows below).
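To make the two-step iteration concrete, here is a minimal Python sketch of a generic Dinkelbach loop for maximizing a ratio $N(Q)/D(Q)$ with $D(Q) > 0$. The callables `best_response`, `numerator` and `denominator`, as well as the scalar toy problem at the end, are illustrative assumptions standing in for the closed-form per-user updates above.

```python
def dinkelbach(best_response, numerator, denominator, lam0=0.0, tol=1e-8, max_iter=100):
    """Generic Dinkelbach loop for: maximize N(Q)/D(Q) with D(Q) > 0.

    best_response(lam) returns argmax_Q N(Q) - lam * D(Q) (Step 1);
    lam is then updated to the current objective ratio (Step 2).
    """
    lam = lam0
    for _ in range(max_iter):
        Q = best_response(lam)                 # Step 1: inner concave problem
        num, den = numerator(Q), denominator(Q)
        if abs(num - lam * den) < tol:         # optimality: max_Q N - lam*D = 0
            break
        lam = num / den                        # Step 2: ratio update
    return Q, lam

# Toy usage: maximize (1 + 2q - q^2) / (1 + q) over q >= 0; the inner problem
# is a concave quadratic whose maximizer is q = (2 - lam)/2, clipped at 0.
Q, lam = dinkelbach(lambda lam: max(0.0, (2.0 - lam) / 2.0),
                    lambda q: 1.0 + 2.0 * q - q * q,
                    lambda q: 1.0 + q)
```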


Global energy efficiency maximization in MIMO systems

[Figure: Energy efficiency maximization: achieved energy efficiency (nats/Hz/s/Joule) versus time (sec) and versus the number of iterations; number of Tx antennas: M = 50, number of cells: K = 10. Curves: best-response with successive line search (proposed) and successive lower bound maximization (state-of-the-art).]

State-of-the-art: A. Zappone, L. Sanguinetti, G. Bacci, E. Jorswieck, and M. Debbah, "Energy-efficient power control: A look at 5G wireless technologies," IEEE Transactions on Signal Processing, vol. 64, no. 7, pp. 1668-1683, April 2016.


Global energy efficiency maximization in MIMO systems

[Figure: Energy efficiency maximization: achieved energy efficiency (nats/Hz/s/Joule) versus time (sec) and versus the number of iterations; number of Tx antennas: M = 50, number of cells: K = 50. Curves: best-response with successive line search (proposed) and successive lower bound maximization (state-of-the-art).]

State-of-the-art: A. Zappone, L. Sanguinetti, G. Bacci, E. Jorswieck, and M. Debbah, "Energy-efficient power control: A look at 5G wireless technologies," IEEE Transactions on Signal Processing, vol. 64, no. 7, pp. 1668-1683, April 2016.


Global energy efficiency maximization in MIMO systems

• Global energy efficiency (GEE): ratio of the sum transmission rate to the sum power consumption.

$$\underset{Q}{\text{maximize}} \quad f(Q) \triangleq \frac{\sum_{k=1}^K r_k(Q)}{\sum_{k=1}^K (P_{0,k} + \rho_k \mathrm{tr}(Q_k))}$$
$$\text{subject to} \quad Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k,\ \forall k,$$
$$\qquad\qquad\quad r_k(Q) \ge R_k,\ \forall k. \leftarrow \text{QoS constraints}$$

• $P_{0,k}$: the power consumption at zero RF output power.
• $\rho_k$: the slope of the load-dependent power consumption.
• $P_k$: the maximum transmission power.
• $R_k$: minimum guaranteed transmission rate for user $k$.
• Iterative algorithm based on the approximation of
  – the nonconcave objective function: pseudoconvex approximation;
  – the nonconvex constraint set: inner approximation.


Sum energy efficiency maximization in MIMO systems

• MIMO multi-cell network with $K$ cells.
• Each cell serves a single user in the downlink.
• $Q_k$: transmit covariance matrix of user $k$.
• $\sigma_k^2$: noise covariance at user $k$.
• $\{H_{kj}\}_{k,j}$: channel from BS $j$ to user $k$.
• Inter-cell interference is treated as noise.
• $\sum_{j\neq k} H_{kj} Q_j H_{kj}^H$: interference covariance at user $k$.
• Achievable rate for link $k$ (downlink):

$$r_k(Q) \triangleq \log\Big|I + \Big(\sigma_k^2 I + \sum_{j\neq k} H_{kj} Q_j H_{kj}^H\Big)^{-1} H_{kk} Q_k H_{kk}^H\Big| = \log\Big|\sigma_k^2 I + \sum_{j=1}^K H_{kj} Q_j H_{kj}^H\Big| - \log\Big|\sigma_k^2 I + \sum_{j\neq k} H_{kj} Q_j H_{kj}^H\Big|.$$

$r_k(Q)$ is concave in $Q_k$, but not jointly concave in $Q = (Q_j)_{j=1}^K$.


Sum energy efficiency maximization in MIMO systems

• Sum energy efficiency (SEE): sum of the individual energy efficiencies.

$$\underset{Q}{\text{maximize}} \quad f(Q) \triangleq \sum_{k=1}^K \frac{r_k(Q)}{P_{0,k} + \rho_k \mathrm{tr}(Q_k)} \leftarrow \text{nonconcave}$$
$$\text{subject to} \quad Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k,\ \forall k.$$

• $P_{0,k}$: the power consumption at zero RF output power.
• $\rho_k$: the slope of the load-dependent power consumption.
• $P_k$: the maximum transmission power.
• The SEE maximization problem is nonconvex and generally difficult to solve (NP-hard).


Sum energy efficiency maximization in MIMO systems

We derive the approximate function at $Q = Q^t$ using the Jacobi method as:

$$f^{(c)}(Q; Q^t) = \sum_{j=1}^K \Bigg( \underbrace{\frac{r_j(Q_j, Q_{-j}^t)}{P_j(Q_j)}}_{\text{pseudoconcave}} + \underbrace{\sum_{k\neq j} \frac{r_j(Q_k, Q_{-k}^t)}{P_j(Q_j^t)}}_{\text{not (pseudo)concave}} \Bigg) \leftarrow \text{not pseudoconcave}.$$

Problem: the approximate function $f^{(c)}(Q; Q^t)$ is generally not pseudoconcave.

Straightforward idea: linearize the numerator of the non-pseudoconcave term:

$$f^{(c')}(Q; Q^t) = \sum_{k=1}^K \Bigg( \underbrace{\frac{r_k(Q_k, Q_{-k}^t)}{P_k(Q_k)}}_{\text{pseudoconcave}} + \underbrace{\sum_{j\neq k} \frac{r_j(Q^t) + (Q_k - Q_k^t) \bullet \nabla_k r_j(Q_k^t, Q_{-k}^t)}{P_j(Q_j^t)}}_{\text{linear}} \Bigg).$$

Problem: the sum of a pseudoconcave function and a linear function is generally not pseudoconcave.


Sum energy efficiency maximization in MIMO systems

Solution: introduce a common denominator before linearizing the numerator.

$$f^{(c)}(Q; Q^t) = \sum_{k=1}^K \Bigg( \underbrace{\frac{r_k(Q_k, Q_{-k}^t)}{P_k(Q_k)}}_{\text{pseudoconcave}} + \sum_{j\neq k} \underbrace{\frac{r_j(Q_k, Q_{-k}^t)\, P_k(Q_k)}{P_j(Q_j^t)\, P_k(Q_k)}}_{\text{linearize numerator}} \Bigg).$$

After linearizing the numerator, each summand of the approximate function becomes pseudoconcave:

$$f^{(c)}(Q; Q^t) = \sum_{k=1}^K f_k^{(c)}(Q_k; Q^t),$$

where each $f_k^{(c)}(Q_k; Q^t)$ is a nonnegative concave function over a positive convex function (thus pseudoconcave!):

$$f_k^{(c)}(Q_k; Q^t) = \frac{r_k(Q_k, Q_{-k}^t)}{P_k(Q_k)} + \sum_{j\neq k} \frac{r_j(Q^t)\, P_k(Q_k^t) + (Q_k - Q_k^t) \bullet \nabla_k \big( r_j(Q_k, Q_{-k}^t)\, P_k(Q_k) \big)\big|_{Q_k = Q_k^t}}{P_j(Q_j^t)\, P_k(Q_k)}.$$


Sum energy efficiency maximization in MIMO systems

• Equal gradient condition at $Q = Q^t$:

$$\nabla_k f^{(c)}(Q; Q^t)\big|_{Q=Q^t} = \nabla_k f(Q)\big|_{Q=Q^t}.$$

• This follows directly from the Jacobi approach and the partial linearization.
• The approximate problem is:

$$\underset{(Q_k)_{k=1}^K}{\text{maximize}} \quad \underbrace{\sum_{k=1}^K \underbrace{f_k^{(c)}(Q_k; Q^t)}_{\text{pseudoconcave in } Q_k}}_{\text{not necessarily pseudoconcave in } Q = (Q_k)_{k=1}^K}$$
$$\text{subject to} \quad Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k,\ k = 1,\dots,K.$$

• The sum of pseudoconcave functions is not necessarily pseudoconcave.


Sum energy efficiency maximization in MIMO systems

• The approximate problem can be decomposed into $K$ parallel subproblems:

$$\underset{Q_1 \succeq 0,\ \mathrm{tr}(Q_1) \le P_1}{\text{maximize}}\ f_1^{(c)}(Q_1; Q^t) \quad \cdots \quad \underset{Q_K \succeq 0,\ \mathrm{tr}(Q_K) \le P_K}{\text{maximize}}\ f_K^{(c)}(Q_K; Q^t)$$

• Each subproblem is pseudoconcave!

$$\mathbb{B}_k Q^t = \underset{Q_k \succeq 0,\ \mathrm{tr}(Q_k) \le P_k}{\arg\max}\ f_k^{(c)}(Q_k; Q^t), \quad k = 1,\dots,K$$
$$\implies f_k^{(c)}(\mathbb{B}_k Q^t; Q^t) > f_k^{(c)}(Q_k^t; Q^t) \quad \text{(if } \mathbb{B}_k Q^t \neq Q_k^t\text{)}.$$


Sum energy efficiency maximization in MIMO systems

• A descent direction for the original problem is obtained as:

$$0 < (\mathbb{B}_k Q^t - Q_k^t) \bullet \nabla_k f_k^{(c)}(Q_k; Q^t)\big|_{Q_k = Q_k^t} = (\mathbb{B}_k Q^t - Q_k^t) \bullet \nabla_k f(Q^t).$$

• Adding up over $k = 1,\dots,K$, we obtain $(\mathbb{B}Q^t - Q^t) \bullet \nabla f(Q^t) > 0$, where $\mathbb{B}Q^t = (\mathbb{B}_k Q^t)_{k=1}^K$.
• Stepsize: successive line search (see the sketch after this list).

The proposed algorithm has the following advantages:
• Easy implementation: closed-form solution + parallel decomposition.
• Fast convergence: the problem structure is preserved as much as possible, while the problem is decoupled into subproblems.
• Guaranteed convergence: despite the fact that the approximate problem is not even pseudoconcave.
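For concreteness, here is a minimal Python sketch of the successive line search (an Armijo-type backtracking rule) for maximizing a differentiable objective along the direction $d = \mathbb{B}Q^t - Q^t$; the function names and the parameter values `alpha` and `beta` are illustrative assumptions, not part of the original algorithm statement.

```python
import numpy as np

def successive_line_search(f, grad_f, x, d, alpha=0.1, beta=0.5, max_backtracks=30):
    """Armijo-type successive line search for MAXIMIZING f along d.

    Returns the largest gamma in {1, beta, beta^2, ...} with
    f(x + gamma*d) >= f(x) + alpha * gamma * <grad f(x), d>;
    such a gamma exists whenever d is an ascent direction,
    i.e. <grad f(x), d> > 0, as established above.
    """
    fx = f(x)
    slope = np.vdot(grad_f(x), d).real  # A . B = Re(Tr(A^H B)) for matrix variables
    gamma = 1.0
    for _ in range(max_backtracks):
        if f(x + gamma * d) >= fx + alpha * gamma * slope:
            return gamma
        gamma *= beta
    return gamma
```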


Sum energy efficiency maximization in MIMO systems

• Number of Tx and Rx antennas: 4 and 4.
• Number of cells: K = 5 and 10.
• Noise covariance $\sigma^2 = 1$, and power budget $P_k = 36$ dBm.

[Figure: achieved sum energy efficiency (bits/Joule) versus the number of iterations, for 5 and 10 cells.]


A general framework: nonsmooth problems

• Consider the following nonsmooth optimization problem:

$$\underset{x}{\text{minimize}} \quad f(x) + g(x) \qquad \text{subject to} \quad x \in \mathcal{X}.$$

• It is equivalent to the following (epigraph) reformulation:

$$\underset{x,\,y}{\text{minimize}} \quad f(x) + y \qquad \text{subject to} \quad x \in \mathcal{X},\ g(x) \le y.$$

• The approximate problem is

$$\underset{x,\,y}{\text{minimize}} \quad \tilde{f}(x; x^t) + y \qquad \text{subject to} \quad x \in \mathcal{X},\ g(x) \le y,$$

with optimal solution denoted as $(\mathbb{B}x^t, y^\star(x^t))$ and $y^\star(x^t) = g(\mathbb{B}x^t)$.
• Assumption verification:

$$\nabla_x\big(\tilde{f}(x; x^t) + y\big)\big|_{x=x^t,\,y=y^t} = \nabla_x f(x^t) = \nabla_x\big(f(x) + y\big)\big|_{x=x^t}, \qquad \nabla_y\big(\tilde{f}(x; x^t) + y\big) = 1 = \nabla_y\big(f(x) + y\big).$$

• Exact line search over $f(x) + y$:

$$\gamma^t \in \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + y^t + \gamma\big(y^\star(x^t) - y^t\big) \Big\},$$

where $y^t = g(x^t)$ and $y^\star(x^t) = g(\mathbb{B}x^t)$. Substituting these values gives

$$\gamma^t \in \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + g(x^t) + \gamma\big(g(\mathbb{B}x^t) - g(x^t)\big) \Big\},$$

and, dropping the constant $g(x^t)$,

$$\gamma^t \in \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + \gamma\big(g(\mathbb{B}x^t) - g(x^t)\big) \Big\}.$$


A general framework: nonsmooth problems

• Consider the following nonsmooth optimization problem:

$$\underset{x}{\text{minimize}} \quad f(x) + g(x) \qquad \text{subject to} \quad x \in \mathcal{X}.$$

• The approximate problem is

$$\underset{x}{\text{minimize}} \quad \tilde{f}(x; x^t) + g(x) \qquad \text{subject to} \quad x \in \mathcal{X}.$$

• The exact line search is over a differentiable function:

$$\gamma^t \in \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + \gamma\big(g(\mathbb{B}x^t) - g(x^t)\big) \Big\}.$$

• By contrast, the traditional exact line search suffers from a high complexity, since it involves the nonsmooth function $g$:

$$\gamma^t \in \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + g\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) \Big\}.$$


LASSO

• An important and widely studied problem in sparse regularization:

$$\underset{x}{\text{minimize}} \quad \underbrace{\tfrac{1}{2}\|Ax - b\|_2^2}_{\triangleq\, f(x)} + \underbrace{\mu\|x\|_1}_{\triangleq\, g(x)}.$$

• The approximate problem with a scalar decomposition $x = (x_k)_{k=1}^K$ is:

$$\mathbb{B}x^t = \underset{x}{\arg\min}\ \underbrace{\sum_{k=1}^K f(x_k, x_{-k}^t)}_{=\,\tilde{f}(x;\,x^t)} + g(x) = D^{-1}\, \mathcal{S}_{\mu\mathbf{1}}\big(Dx^t - A^T(Ax^t - b)\big),$$

where $D \triangleq \mathrm{diag}(A^T A)$ and $\mathcal{S}_a(b) = [b - a]^+ - [-b - a]^+$ is applied componentwise.
• This closed-form expression is known as the soft-thresholding operator.
• This choice of approximate function is well known, see Elad (2006); Facchinei, Scutari, and Sagratella (2015). A minimal numerical sketch is given below.
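A minimal numpy sketch of this best response; `soft_threshold` and `lasso_best_response` are illustrative names, and the columns of $A$ are assumed to be nonzero so that $D$ is invertible.

```python
import numpy as np

def soft_threshold(b, a):
    """Componentwise S_a(b) = [b - a]_+ - [-b - a]_+."""
    return np.maximum(b - a, 0.0) - np.maximum(-b - a, 0.0)

def lasso_best_response(A, b, x, mu):
    """Jacobi best response Bx = D^{-1} S_mu(D x - A^T (A x - b)),
    where D = diag(A^T A) holds the squared column norms of A."""
    d = np.sum(A * A, axis=0)                       # diag(A^T A)
    return soft_threshold(d * x - A.T @ (A @ x - b), mu) / d
```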


LASSO

• An important and widely studied problem in sparse signal recovery:

$$\underset{x}{\text{minimize}} \quad \underbrace{\tfrac{1}{2}\|Ax - b\|_2^2}_{\triangleq\, f(x)} + \underbrace{\mu\|x\|_1}_{\triangleq\, g(x)}.$$

• The exact line search has a closed-form solution:

$$\gamma^t = \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + \gamma\big(g(\mathbb{B}x^t) - g(x^t)\big) \Big\} = \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ \tfrac{1}{2}\big\|A\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) - b\big\|_2^2 + \gamma\mu\big(\|\mathbb{B}x^t\|_1 - \|x^t\|_1\big) \Big\}$$
$$= \left[ \frac{-\big\langle Ax^t - b,\ A(\mathbb{B}x^t - x^t)\big\rangle - \mu\big(\|\mathbb{B}x^t\|_1 - \|x^t\|_1\big)}{\big\langle A(\mathbb{B}x^t - x^t),\ A(\mathbb{B}x^t - x^t)\big\rangle} \right]_0^1.$$

• This yields the Soft-Thresholding with Exact Line search Algorithm (STELA, Yang and Pesavento (2017)); a sketch of the step size computation follows below.
• By comparison, the traditional exact line search is:

$$\gamma^t = \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ f\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) + g\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) \Big\} = \underset{0 \le \gamma \le 1}{\arg\min}\ \Big\{ \tfrac{1}{2}\big\|A\big(x^t + \gamma(\mathbb{B}x^t - x^t)\big) - b\big\|_2^2 + \mu \sum_{k=1}^K \big|x_k^t + \gamma(\mathbb{B}_k x^t - x_k^t)\big| \Big\},$$

which has no closed-form solution because of the absolute-value terms.
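The step size computation as a numpy sketch matching the closed-form expression above, using the hypothetical `lasso_best_response` from the previous sketch to produce $\mathbb{B}x^t$; the guard against a zero denominator is an added assumption for the converged case $\mathbb{B}x^t = x^t$.

```python
import numpy as np

def stela_stepsize(A, b, x, Bx, mu):
    """Closed-form exact line search for LASSO, clipped to [0, 1]."""
    d = Bx - x
    Ad = A @ d
    den = Ad @ Ad
    if den == 0.0:                       # Bx == x: already stationary
        return 0.0
    num = -((A @ x - b) @ Ad) - mu * (np.abs(Bx).sum() - np.abs(x).sum())
    return float(np.clip(num / den, 0.0, 1.0))

# One STELA iteration:
#   Bx = lasso_best_response(A, b, x, mu)
#   x  = x + stela_stepsize(A, b, x, Bx, mu) * (Bx - x)
```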


LASSO

State-of-the-art algorithm: FLEXA (Facchinei, Scutari, and Sagratella, 2015), with
• the same approximate function as STELA;
• decreasing stepsizes $\gamma^{t+1} = \gamma^t(1 - \alpha\gamma^t)$, where $\alpha$ controls the decreasing rate.

[Figure: error $e(x^t)$ versus the number of iterations. STELA (parallel update with simplified exact line search, proposed) versus FLEXA (parallel update with decreasing stepsize, state-of-the-art) for decreasing rates $\alpha \in \{10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}\}$.]


LASSO

State-of-the-art algorithms:
• FISTA (Beck and Teboulle, 2009)
• ADMM (Boyd, Parikh, Chu, Peleato, and Eckstein, 2010)
• GreedyBCD (Peng, Yan, and Yin, 2013)
• SpaRSA (Wright, Nowak, and Figueiredo, 2009)


LASSO

[Figure: error versus time (sec) for STELA (proposed), ADMM, FISTA, GreedyBCD, and SpaRSA; densities 0.1, 0.2 and 0.4; problem sizes $N \times K$: 2000 × 4000 and 5000 × 10000.]


STELA extension to basis pursuit (BP)

• The basis pursuit (BP) problem is:

$$\text{minimize} \quad \|x\|_1 \qquad \text{subject to} \quad Ax = b.$$

• It can be solved by the augmented Lagrangian approach.
• The augmented Lagrangian is

$$L_c(x, \mu) = \|x\|_1 + \mu^T(Ax - b) + \frac{c}{2}\|Ax - b\|_2^2.$$

• STELA is used to minimize the augmented Lagrangian:

$$x^\star(\mu^t) \triangleq \underset{x}{\arg\min}\ L_c(x, \mu^t).$$

• The dual variable $\mu$ is updated as follows:

$$\mu^{t+1} = \mu^t + c\big(Ax^\star(\mu^t) - b\big).$$

A minimal sketch of this outer loop is given below.
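The sketch below relies on the fact that, up to an additive constant, minimizing $L_c(x, \mu)$ over $x$ is the LASSO problem $\min_x \frac{1}{2}\|Ax - (b - \mu/c)\|_2^2 + \frac{1}{c}\|x\|_1$ (complete the square in the quadratic terms), so any LASSO solver such as STELA can be plugged in. The interface `lasso_solver(A, rhs, reg)` is an assumed signature, not a fixed API.

```python
import numpy as np

def bp_augmented_lagrangian(A, b, lasso_solver, c=1.0, outer_iters=100, tol=1e-8):
    """Basis pursuit via the augmented Lagrangian / dual ascent."""
    m, n = A.shape
    x, mu = np.zeros(n), np.zeros(m)
    for _ in range(outer_iters):
        # x*(mu): completing the square in L_c gives a LASSO subproblem
        x = lasso_solver(A, b - mu / c, 1.0 / c)
        r = A @ x - b
        if np.linalg.norm(r) < tol:      # primal feasibility reached
            break
        mu = mu + c * r                  # dual variable update
    return x
```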


STELA extension to basis pursuit (BP)

• STELA for BP is tested on the benchmarking platform developed by the Optimization Group at Technische Universität Darmstadt.
• A total of 100 instances of $A$ and $b$ are designed, reflecting different levels of solution difficulty.
• STELA is compared with the following solvers:
  – CPLEX (http://www.cplex.com)
  – $\ell_1$-Homotopy (http://users.ece.gatech.edu/sasif/homotopy/)
  – SPGL1 (https://www.math.ucdavis.edu/~mpf/spgl1/)
  – $\ell_1$-Magic (http://statweb.stanford.edu/~candes/l1magic/)
  – YALL1 (http://yall1.blogs.rice.edu/)
  – ISAL1 (http://wwwopt.mathematik.tu-darmstadt.de/spear/)


STELA extension to basis pursuit (BP)

• STELA is one of the three solvers that solve all problem instances.
• Note that HOC stands for "Heuristic Optimality Check", proposed in Lorenz, Pfetsch, and Tillmann (2015).


STELA extension to basis pursuit (BP)

• Although STELA must be called multiple times (once for each augmented Lagrangian subproblem), it is the second fastest solver for BP.
• More simulation results are reported in Kuske and Tillmann (2016).


STELA extension to MD rank-sparse regularization

Multipath parameter estimation (Steffens, Yang, and Pesavento (2016))

• $P$ propagation paths, each with
  – complex channel gain coefficient $\beta_p^{(0)}$,
  – propagation delay $\tau_p^{(0)}$, and
  – angle-of-arrival (AoA) $\theta_p^{(0)}$.

[Figure: multipath scenario with a source, scatterers, and a linear antenna array; path $p$ contributes $\beta_p^{(0)}\delta(t - \tau_p^{(0)})$ and arrives from angle $\theta_p$.]


Multipath parameter estimation

[Figure: channel parameter plane: AoA $\theta$ (degrees) versus delay $\tau$ (in sampling periods), with the channel gain $|\beta|$ of each path.]


Multipath propagation model

• Signal received by the antenna array at time instant $\ell$ (OFDM symbol):

$$Y(\ell) = A^{(0)} H^{(0)}(\ell)\, B^{(0)T} X(\ell) + W(\ell) \in \mathbb{C}^{M \times N},$$

with $[Y(\ell)]_{m,n}$ denoting the signal at antenna $m$ on subcarrier $n$.
• Array steering matrix: $A^{(0)} = \big[a(\theta_1^{(0)}), \dots, a(\theta_P^{(0)})\big] \in \mathbb{C}^{M \times P}$, with $a_m(\theta_p^{(0)}) = e^{-j r_m \frac{2\pi}{\lambda}\cos\theta_p^{(0)}}$.
• Frequency response matrix: $B^{(0)} = \big[b(\tau_1^{(0)}), \dots, b(\tau_P^{(0)})\big] \in \mathbb{C}^{N \times P}$, with $b_n(\tau_p^{(0)}) = e^{-j \frac{2\pi}{N T_s}(n-1)\tau_p^{(0)}}$.
• Channel gain matrix: $H^{(0)}(\ell) = \mathrm{diag}\big(\beta_1^{(0)}(\ell), \dots, \beta_P^{(0)}(\ell)\big) \in \mathbb{C}^{P \times P}$.
• Reference signal matrix: $X(\ell) = \mathrm{diag}\big(x_1(\ell), \dots, x_N(\ell)\big) \in \mathbb{C}^{N \times N}$.
• AWGN matrix: $W(\ell)$.

A small sketch of the two structured matrices follows below.
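As a small illustration, a numpy sketch that builds the steering and frequency response matrices from given parameters; the argument names (`theta`, `tau`, `r`, `N`, `Ts`, `wavelength`) are illustrative assumptions, and zero-based subcarrier indexing replaces the $(n-1)$ term.

```python
import numpy as np

def steering_matrices(theta, tau, r, N, Ts, wavelength):
    """A: M x P array steering matrix, a_m = exp(-j r_m (2 pi / lambda) cos theta_p);
    B: N x P frequency response matrix, b_n = exp(-j 2 pi / (N Ts) n tau_p)."""
    A = np.exp(-1j * (2 * np.pi / wavelength) * np.outer(r, np.cos(theta)))
    B = np.exp(-1j * (2 * np.pi / (N * Ts)) * np.outer(np.arange(N), tau))
    return A, B
```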


Multipath propagation model

• Collect $L$ snapshots and define

$$Z = \big[\mathrm{vec}\big(Y(1)X^{-1}(1)\big), \dots, \mathrm{vec}\big(Y(L)X^{-1}(L)\big)\big]$$
$$G^{(0)} = \big[\mathrm{vecd}\big(H^{(0)}(1)\big), \dots, \mathrm{vecd}\big(H^{(0)}(L)\big)\big]$$
$$W = \big[\mathrm{vec}\big(W(1)X^{-1}(1)\big), \dots, \mathrm{vec}\big(W(L)X^{-1}(L)\big)\big]$$

• Multiple snapshot signal model:

$$Z = \big(B^{(0)} \diamond A^{(0)}\big)\, G^{(0)} + W \in \mathbb{C}^{MN \times L},$$

with the Khatri-Rao product $\diamond$, i.e., the columnwise Kronecker product $\otimes$.


Multipath propagation model: structure

• Structure in the reformulated signal model (neglecting noise):

$$Z = \underbrace{\big(B^{(0)} \otimes I_M\big)}_{= [\,b_1^{(0)} \otimes I_M,\ \dots,\ b_P^{(0)} \otimes I_M\,]} \underbrace{\big(I_P \diamond A^{(0)}\big)}_{= \mathrm{blkdiag}(a_1^{(0)}, \dots, a_P^{(0)})} \underbrace{G^{(0)}}_{= [\,g_1^{(0)T};\ \dots;\ g_P^{(0)T}\,]} = \sum_{p=1}^P \big(b_p^{(0)} \otimes I_M\big)\, a_p^{(0)}\, g_p^{(0)T} = C^{(0)} G^{(0)},$$

with $C^{(0)} \triangleq \big(B^{(0)} \otimes I_M\big)\big(I_P \diamond A^{(0)}\big) = \big[\,b_1^{(0)} \otimes a_1^{(0)},\ \dots,\ b_P^{(0)} \otimes a_P^{(0)}\,\big] = B^{(0)} \diamond A^{(0)}$.

A numerical check of this factorization is sketched below.
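The factorization can be verified numerically; below is a small numpy sketch in which the helper `khatri_rao` is an illustrative implementation of the columnwise Kronecker product.

```python
import numpy as np

def khatri_rao(B, A):
    """Columnwise Kronecker product: column p equals kron(b_p, a_p)."""
    (N, P), M = B.shape, A.shape[0]
    return (B[:, None, :] * A[None, :, :]).reshape(N * M, P)

# Check (B kron I_M)(I_P khatri-rao A) = B khatri-rao A on random data.
rng = np.random.default_rng(0)
M, N, P = 3, 4, 2
A = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))
B = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))
C = np.kron(B, np.eye(M)) @ khatri_rao(np.eye(P), A)   # I_P khatri-rao A = blkdiag(a_p)
assert np.allclose(C, khatri_rao(B, A))
```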


Rank-sparse regularization

• The rank-sparse (nuclear norm) regularization problem is given as:

$$\hat{G} = \underset{G}{\arg\min}\ \tfrac{1}{2}\|Z - CG\|_F^2 + \mu \sum_{q=1}^Q \|G_q\|_*$$

• Decompose the problem into $Q$ approximate subproblems (Steffens, Yang, and Pesavento (2016)):

$$\mathbb{B}G_q^{(t)} = \underset{G_q}{\arg\min}\ \tfrac{1}{2}\big\|Z - C_{-q}G_{-q}^{(t)} - C_q G_q\big\|_F^2 + \mu\big(\|G_{-q}^{(t)}\|_* + \|G_q\|_*\big),$$

with $C_{-q} = [C_1, \dots, C_{q-1}, C_{q+1}, \dots, C_Q]$ and $G_{-q}^{(t)} = \big[G_1^{(t)T}, \dots, G_{q-1}^{(t)T}, G_{q+1}^{(t)T}, \dots, G_Q^{(t)T}\big]^T$.


The approximate function

• Closed-form solution for the approximate subproblems:

$$\mathbb{B}G_q^{(t)} = \mathcal{S}_\mu\big(\Gamma_q^{(t)}\big),$$

where

$$\Gamma_q^{(t)} = \big(C_q^H C_q\big)^{-1} C_q^H \big(Z - C_{-q} G_{-q}^{(t)}\big)$$

is the least-squares estimate of $G_q$, with compact singular value decomposition $\Gamma_q^{(t)} = U_q^{(t)} \Omega_q^{(t)} V_q^{(t)H}$, and

$$\mathcal{S}_\mu\big(\Gamma_q^{(t)}\big) = U_q^{(t)} \big(\Omega_q^{(t)} - \mu I\big)^+ V_q^{(t)H}$$

is the singular value thresholding operator, with $[(X)^+]_{ij} = \max([X]_{ij}, 0)$. A numpy sketch follows below.
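A numpy sketch of the two ingredients (least-squares estimate and singular value thresholding); the function names are illustrative, and $C_q$ is assumed to have full column rank so that $C_q^H C_q$ is invertible.

```python
import numpy as np

def ls_estimate(Cq, Z_res):
    """Gamma_q = (Cq^H Cq)^{-1} Cq^H Z_res, with Z_res = Z - C_{-q} G_{-q}."""
    return np.linalg.solve(Cq.conj().T @ Cq, Cq.conj().T @ Z_res)

def svt(Gamma, mu):
    """Singular value thresholding: shrink the singular values by mu."""
    U, s, Vh = np.linalg.svd(Gamma, full_matrices=False)   # compact SVD
    return (U * np.maximum(s - mu, 0.0)) @ Vh              # U (Omega - mu I)_+ V^H

# Best response for block q:
#   BG_q = svt(ls_estimate(C_q, Z - C_minus_q @ G_minus_q), mu)
```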


Update step

• Denote the solutions to the approximate subproblems as the best-response matrix

$$\mathbb{B}G^{(t)} = \big[\mathbb{B}G_1^{(t)T}, \dots, \mathbb{B}G_Q^{(t)T}\big]^T.$$

• The approximate solution $G^{(t)}$ is updated in iteration $t$ by

$$G^{(t+1)} = G^{(t)} + \gamma^{(t)}\big(\mathbb{B}G^{(t)} - G^{(t)}\big).$$

• Closed-form solution for the update step size:

$$\gamma^{(t)} = \underset{0 \le \gamma \le 1}{\arg\min}\ \tfrac{1}{2}\big\|Z - C\big(G^{(t)} + \gamma(\mathbb{B}G^{(t)} - G^{(t)})\big)\big\|_F^2 + \gamma\mu \sum_{q=1}^Q \big(\|\mathbb{B}G_q^{(t)}\|_* - \|G_q^{(t)}\|_*\big)$$
$$= \left[ \frac{-\Re\Big\{\mathrm{Tr}\Big(\big(CG^{(t)} - Z\big)^H C\big(\mathbb{B}G^{(t)} - G^{(t)}\big)\Big)\Big\} - \mu \sum_{q=1}^Q \big(\|\mathbb{B}G_q^{(t)}\|_* - \|G_q^{(t)}\|_*\big)}{\big\|C\big(\mathbb{B}G^{(t)} - G^{(t)}\big)\big\|_F^2} \right]_0^1$$


Questions? Thank you!


References

J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York, 1970.

S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.

O. L. Mangasarian, Nonlinear Programming. McGraw-Hill, 1969.

Y. Yang and M. Pesavento, "A Unified Successive Pseudoconvex Approximation Framework," IEEE Transactions on Signal Processing, vol. 65, no. 13, pp. 3313–3328, 2017.

M. Razaviyayn, M. Hong, and Z.-Q. Luo, "A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization," SIAM Journal on Optimization, vol. 23, no. 2, pp. 1126–1153, 2013.

Y. Sun, P. Babu, and D. P. Palomar, "Majorization-Minimization Algorithms in Signal Processing, Communications, and Machine Learning," IEEE Transactions on Signal Processing, vol. 65, no. 3, pp. 794–816, 2017.

A. Zappone, E. Björnson, L. Sanguinetti, and E. Jorswieck, "Globally Optimal Energy-Efficient Power Control and Receiver Design in Wireless Networks," IEEE Transactions on Signal Processing, vol. 65, no. 11, pp. 2844–2859, 2017.

D. P. Bertsekas, Nonlinear Programming, 3rd ed. Athena Scientific, 2016.

N. Parikh and S. Boyd, "Proximal Algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.

G. Scutari, F. Facchinei, P. Song, D. P. Palomar, and J.-S. Pang, "Decomposition by Partial Linearization: Parallel Optimization of Multi-Agent Systems," IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 641–656, Feb. 2014.

W. Yu, W. Rhee, S. Boyd, and J. Cioffi, "Iterative Water-Filling for Gaussian Vector Multiple-Access Channels," IEEE Transactions on Information Theory, vol. 50, no. 1, pp. 145–152, Jan. 2004.

J. Bunch, C. Nielsen, and D. Sorensen, "Rank-one modification of the symmetric eigenproblem," Numerische Mathematik, vol. 31, pp. 31–48, 1978.

N. Jindal, W. Rhee, S. Vishwanath, S. Jafar, and A. Goldsmith, "Sum Power Iterative Water-Filling for Multi-Antenna Gaussian Broadcast Channels," IEEE Transactions on Information Theory, vol. 51, no. 4, pp. 1570–1580, Apr. 2005.

Z.-Q. Luo and S. Zhang, "Dynamic Spectrum Management: Complexity and Duality," IEEE Journal of Selected Topics in Signal Processing, vol. 2, no. 1, pp. 57–73, Feb. 2008.

M. Elad, "Why simple shrinkage is still relevant for redundant representations?" IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5559–5569, 2006.

F. Facchinei, G. Scutari, and S. Sagratella, "Parallel Selective Algorithms for Nonconvex Big Data Optimization," IEEE Transactions on Signal Processing, vol. 63, no. 7, pp. 1874–1889, 2015.

A. Beck and M. Teboulle, "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems," SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.

S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, 2010.

Z. Peng, M. Yan, and W. Yin, "Parallel and distributed sparse optimization," in Proc. 2013 Asilomar Conference on Signals, Systems and Computers, Nov. 2013, pp. 646–659.

S. Wright, R. Nowak, and M. Figueiredo, "Sparse Reconstruction by Separable Approximation," IEEE Transactions on Signal Processing, vol. 57, no. 7, pp. 2479–2493, Jul. 2009.

D. Lorenz, M. Pfetsch, and A. Tillmann, "Solving Basis Pursuit: Heuristic Optimality Check and Solver Comparison," ACM Transactions on Mathematical Software, vol. 41, no. 2, pp. 1–29, 2015.

J. Kuske and A. Tillmann, "Solving basis pursuit: Update," Technische Universität Darmstadt, Tech. Rep., April 2016. Available: http://www.mathematik.tu-darmstadt.de/~tillmann/docs/SolvingBPupdateApr2016.pdf

C. Steffens, Y. Yang, and M. Pesavento, "Multidimensional sparse recovery for MIMO channel parameter estimation," in Proc. 24th European Signal Processing Conference (EUSIPCO), Aug. 2016, pp. 66–70.