New primal-dual subgradient methods for Convex Problems with Functional Constraints
Yurii Nesterov, CORE/INMA (UCL)
January 12, 2015 (Les Houches)
Yu. Nesterov New primal-dual methods for functional constraints 1/19
Outline
1 Constrained optimization problem
2 Lagrange multipliers
3 Dual function and dual problem
4 Augmented Lagrangian
5 Switching subgradient methods
6 Finding the dual multipliers
7 Complexity analysis
Optimization problem: simple constraints

Consider the problem: min_{x ∈ Q} f(x), where

Q is a closed convex set: x, y ∈ Q ⇒ [x, y] ⊆ Q,
f is a convex function, subdifferentiable on Q:
f(y) ≥ f(x) + 〈∇f(x), y − x〉, x, y ∈ Q, ∇f(x) ∈ ∂f(x).

Optimality condition: a point x∗ ∈ Q is optimal iff
〈∇f(x∗), x − x∗〉 ≥ 0, ∀x ∈ Q.

Interpretation: the function increases along any feasible direction.
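The variational-inequality condition above can be checked numerically. The following sketch uses a toy problem of my own choosing (not from the slides): f(x) = x² over Q = [1, 2], whose minimizer is x∗ = 1.

```python
import numpy as np

def grad_f(x):
    return 2.0 * x  # gradient of f(x) = x^2

Q = np.linspace(1.0, 2.0, 101)  # dense sample of the feasible interval
x_star = 1.0

# At the optimum, the objective cannot decrease along any feasible direction:
# <grad f(x*), x - x*> >= 0 for every x in Q.
assert all(grad_f(x_star) * (x - x_star) >= 0 for x in Q)

# A non-optimal feasible point violates the condition: from x = 1.5,
# moving toward 1 is a feasible descent direction.
x_bad = 1.5
assert any(grad_f(x_bad) * (x - x_bad) < 0 for x in Q)
```

The check samples Q on a grid; for this one-dimensional interval that is enough to witness both the optimality of x∗ and a violation at an interior point.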
Optimization problem: functional constraints

Problem: min_{x ∈ Q} { f0(x) : fi(x) ≤ 0, i = 1, …, m }, where

Q is a closed convex set,
all fi, i = 0, …, m, are convex and subdifferentiable on Q:
fi(y) ≥ fi(x) + 〈∇fi(x), y − x〉, x, y ∈ Q, ∇fi(x) ∈ ∂fi(x).

Optimality condition (KKT, 1951): a point x∗ ∈ Q is optimal iff there exist Lagrange multipliers λ∗^(i) ≥ 0, i = 1, …, m, such that

(1) 〈∇f0(x∗) + ∑_{i=1}^{m} λ∗^(i) ∇fi(x∗), x − x∗〉 ≥ 0, ∀x ∈ Q,
(2) fi(x∗) ≤ 0, i = 1, …, m, (feasibility)
(3) λ∗^(i) fi(x∗) = 0, i = 1, …, m. (complementary slackness)
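The three KKT conditions can be verified on a small instance. Below is a toy example of mine (not from the slides): minimize f0(x) = x² subject to f1(x) = 1 − x ≤ 0, with Q = R, whose solution is x∗ = 1 with multiplier λ∗ = 2.

```python
x_star, lam_star = 1.0, 2.0

grad_f0 = 2.0 * x_star      # gradient of f0(x) = x^2 at x*
grad_f1 = -1.0              # gradient of f1(x) = 1 - x
f1_at_star = 1.0 - x_star   # constraint value at x*

# (1) Stationarity: with Q = R, the variational inequality forces the
#     gradient of the Lagrangian to vanish at x*.
assert abs(grad_f0 + lam_star * grad_f1) < 1e-12

# (2) Feasibility.
assert f1_at_star <= 0

# (3) Complementary slackness, with a nonnegative multiplier.
assert abs(lam_star * f1_at_star) < 1e-12
assert lam_star >= 0
```

Here the constraint is active (f1(x∗) = 0), so complementary slackness holds with a strictly positive multiplier; for an inactive constraint the multiplier would have to be zero.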
Lagrange multipliers: interpretation
Let I ⊆ {1, . . . ,m} be an arbitrary set of indexes.
Denote fI(x) = f0(x) +∑i∈I
λ(i)∗ fi (x). Consider the problem
PI : minx∈Q{fI(x) : fi (x) ≤ 0, i 6∈ I}.
Observation: in any case, x∗ is the optimal solution ofproblem PI .
Interpretation: λ(i)∗ are the shadow prices for resources.
(Kantorovich, 1939)
Application examples:
Traffic congestion: car flows on roads ⇔ size of queues.
Electrical networks: currents in the wires ⇔ voltagepotentials, etc.
Main question: How to compute (x∗, λ∗)?
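The observation that x∗ solves P_I for every I can be illustrated on the same kind of toy problem (my example, not from the slides): f0(x) = x², f1(x) = 1 − x ≤ 0, λ∗^(1) = 2, x∗ = 1. With m = 1 there are two index sets, I = ∅ (keep the constraint) and I = {1} (move it into the objective with its multiplier); both are solved below by grid search.

```python
import numpy as np

xs = np.linspace(-3.0, 3.0, 60001)  # grid with step 1e-4
f0 = xs**2
f1 = 1.0 - xs
lam = 2.0  # the optimal multiplier for this instance

# I = {} : the original constrained problem min { f0 : f1 <= 0 }.
feasible = f1 <= 0
x_empty = xs[feasible][np.argmin(f0[feasible])]

# I = {1}: unconstrained minimization of f0 + lam * f1.
x_full = xs[np.argmin(f0 + lam * f1)]

# Both problems are minimized at the same point x* = 1 (up to grid error).
assert abs(x_empty - 1.0) < 1e-3
assert abs(x_full - 1.0) < 1e-3
```

The design point is that only the *optimal* multipliers have this property: pricing the constraint at λ = 2 makes dropping it harmless, while any other price would shift the unconstrained minimizer away from x∗.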
Algebraic interpretation
Consider the Lagrangian L(x, λ) = f_0(x) + ∑_{i=1}^m λ^(i) f_i(x).

Condition KKT(1): 〈∇f_0(x∗) + ∑_{i=1}^m λ∗^(i) ∇f_i(x∗), x − x∗〉 ≥ 0, ∀x ∈ Q, implies

x∗ ∈ Arg min_{x∈Q} L(x, λ∗).

Define the dual function φ(λ) = min_{x∈Q} L(x, λ), λ ≥ 0. It is concave!

By Danskin’s Theorem, ∇φ(λ) = (f_1(x(λ)), . . . , f_m(x(λ))), with x(λ) ∈ Arg min_{x∈Q} L(x, λ).

Conditions KKT(2,3): f_i(x∗) ≤ 0, λ∗^(i) f_i(x∗) = 0, i = 1, . . . , m, imply (x∗ = x(λ∗))

λ∗ ∈ Arg max_{λ≥0} φ(λ).
Yu. Nesterov New primal-dual methods for functional constraints 6/19
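Danskin’s Theorem can be sanity-checked on a hypothetical toy instance (not from the slides): for min x² s.t. 1 − x ≤ 0 over Q = R, the inner minimizer x(λ) = λ/2 is explicit, so the claimed dual gradient f_1(x(λ)) can be compared against a finite difference of φ.

```python
# Danskin's theorem on a hypothetical toy problem (not from the slides):
#   min x^2  s.t.  f1(x) = 1 - x <= 0,  Q = R.
# L(x, lam) = x^2 + lam*(1 - x); the inner minimizer is x(lam) = lam/2,
# so phi(lam) = lam - lam^2/4 and phi'(lam) should equal f1(x(lam)).

def x_of(lam):
    return lam / 2.0          # argmin_x L(x, lam), in closed form

def phi(lam):
    x = x_of(lam)
    return x**2 + lam * (1.0 - x)

def f1(x):
    return 1.0 - x

lam = 0.7
h = 1e-6
numeric_grad = (phi(lam + h) - phi(lam - h)) / (2 * h)
danskin_grad = f1(x_of(lam))   # = 1 - lam/2 = 0.65
print(numeric_grad, danskin_grad)
```

The two values agree, confirming that the constraint residuals evaluated at x(λ) form the gradient of the dual function.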
Algorithmic aspects
Main idea: solve the dual problem max_{λ≥0} φ(λ) by the subgradient method:

1. Compute x(λ_k) and define ∇φ(λ_k) = (f_1(x(λ_k)), . . . , f_m(x(λ_k))).

2. Update λ_{k+1} = Proj_{R^m_+}(λ_k + h_k ∇φ(λ_k)).

Stepsizes h_k > 0 are defined in the usual way.

Main difficulties:

Each iteration is time-consuming.

Unclear termination criterion.

Low rate of convergence (O(1/ε²) upper-level iterations).
Yu. Nesterov New primal-dual methods for functional constraints 7/19
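The two steps above can be sketched as follows on a hypothetical toy instance (not from the slides), min x² s.t. 1 − x ≤ 0 over Q = R, where x(λ) = λ/2 is available in closed form; in general, step 1 is itself an optimization over Q and dominates the cost of each iteration.

```python
# Projected subgradient ascent on the dual (steps 1-2 above), run on a
# hypothetical toy problem (not from the slides):
#   min x^2  s.t.  f1(x) = 1 - x <= 0,  Q = R,
# for which x(lam) = lam/2 and the dual optimum is lam* = 2.
import math

def x_of(lam):
    return lam / 2.0                    # step 1: compute x(lam_k)

def grad_phi(lam):
    return 1.0 - x_of(lam)              # nabla phi(lam_k) = f1(x(lam_k))

lam = 0.0
for k in range(2000):
    h_k = 1.0 / math.sqrt(k + 1)        # "usual" diminishing stepsize
    lam = max(0.0, lam + h_k * grad_phi(lam))  # step 2: project onto R_+

print(lam)  # approaches lam* = 2
```

With the diminishing stepsize h_k ~ 1/√k the iterates approach λ∗ = 2, but slowly, which is the O(1/ε²) behavior noted above.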
Augmented Lagrangian (1970s) [Hestenes, Powell, Rockafellar, Polyak, Bertsekas, . . . ]
Define the Augmented Lagrangian

L̂_K(x, λ) = f_0(x) + (1/(2K)) ∑_{i=1}^m (λ^(i) + K f_i(x))_+^2 − (1/(2K)) ‖λ‖_2^2, λ ∈ R^m,

where K > 0 is a penalty parameter.

Consider the dual function φ̂(λ) = min_{x∈Q} L̂_K(x, λ).

Main properties. Function φ̂ is concave. Its gradient is Lipschitz continuous with constant 1/K.

Its unconstrained maximum is attained at the optimal dual solution.

The corresponding point x̂(λ∗) is the optimal primal solution.

Hint: Check that the equation (λ^(i) + K f_i(x))_+ = λ^(i) is equivalent to KKT(2,3).
Yu. Nesterov New primal-dual methods for functional constraints 8/19
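The hint can be verified directly. A sketch on a hypothetical two-constraint toy problem (not from the slides): at a KKT pair, active constraints have f_i(x∗) = 0 with λ∗^(i) ≥ 0, and inactive ones have λ∗^(i) = 0 with f_i(x∗) < 0, so the map λ^(i) ↦ (λ^(i) + K f_i(x∗))_+ leaves λ∗ unchanged for any K > 0.

```python
# Checking the fixed-point hint (lam + K*f_i(x))_+ = lam at a KKT pair,
# on a hypothetical toy problem (not from the slides):
#   min x^2  s.t.  f1(x) = 1 - x <= 0,  f2(x) = x - 5 <= 0,  Q = R,
# whose solution is x* = 1 with lam* = (2, 0): f1 is active, f2 is not.

def pos(t):
    return max(t, 0.0)    # ( . )_+ , the positive part

x_star = 1.0
lam_star = (2.0, 0.0)
f_vals = (1.0 - x_star, x_star - 5.0)   # f1(x*) = 0, f2(x*) = -4

K = 3.7   # any penalty parameter K > 0 works
updated = tuple(pos(l + K * f) for l, f in zip(lam_star, f_vals))
print(updated)  # equals lam_star, which encodes KKT(2,3)
```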
Method of Augmented Lagrangians
Note that ∇φ̂(λ)^(i) = (1/K) (λ^(i) + K f_i(x̂(λ)))_+ − (1/K) λ^(i), i = 1, . . . , m.

Therefore, the usual gradient method λ_{k+1} = λ_k + K ∇φ̂(λ_k) is exactly as follows:

Method: λ_{k+1} = (λ_k + K f(x̂(λ_k)))_+.

Advantage: Fast convergence of the dual process.

Disadvantages:

Difficult iteration.

Unclear termination.

No global complexity analysis.

Do we have an alternative?
Yu. Nesterov New primal-dual methods for functional constraints 9/19
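The multiplier iteration can be sketched on a hypothetical toy problem (not from the slides) where the inner minimizer x̂(λ) of the augmented Lagrangian is available in closed form; for this instance the dual error contracts by a factor 2/(2 + K) per step, illustrating the fast dual convergence (and why each iteration is the hard part: in general x̂(λ) needs a full inner solve).

```python
# Multiplier method lam_{k+1} = (lam_k + K * f(xhat(lam_k)))_+ on a
# hypothetical toy problem (not from the slides):
#   min x^2  s.t.  f1(x) = 1 - x <= 0,  Q = R,  with lam* = 2, x* = 1.
# For this problem the inner minimizer of L̂_K is available in closed form:
#   xhat(lam) = (lam + K) / (2 + K)
# (the term lam + K*(1 - x) stays nonnegative along the iteration).

K = 10.0   # penalty parameter; larger K means faster dual contraction

def xhat(lam):
    return (lam + K) / (2.0 + K)   # argmin_x of the augmented Lagrangian

def f1(x):
    return 1.0 - x

lam = 0.0
for _ in range(30):
    lam = max(0.0, lam + K * f1(xhat(lam)))   # the multiplier update

print(lam, xhat(lam))  # approach lam* = 2 and x* = 1
```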
Problem formulation

Problem: f^* = inf_{x ∈ Q} { f_0(x) : f_i(x) ≤ 0, i = 1, …, m }, where

f_i(x), i = 0, …, m, are closed convex functions on Q endowed with first-order black-box oracles,

Q ⊂ E is a bounded simple closed convex set. (We can solve some auxiliary optimization problems over Q.)

Defining the Lagrangian

L(x, λ) = f_0(x) + ∑_{i=1}^m λ^{(i)} f_i(x),  x ∈ Q, λ ∈ R^m_+,

we can introduce the Lagrangian dual problem f_* := sup_{λ ∈ R^m_+} φ(λ), where φ(λ) := inf_{x ∈ Q} L(x, λ).

Clearly, f^* ≥ f_* (weak duality). Later, we will show f^* = f_* algorithmically.
Yu. Nesterov New primal-dual methods for functional constraints 10/19
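As a sanity check of the weak-duality inequality f^* ≥ f_*, the following sketch evaluates φ(λ) = inf_{x∈Q} L(x, λ) by brute force on a toy instance (the specific f_0, f_1, and Q below are assumptions chosen for illustration; here Slater's condition holds, so strong duality is expected):

```python
# Toy check of weak duality (illustrative assumption):
#   f0(x) = (x-2)^2, f1(x) = x - 1 <= 0, Q = [0, 2].
# Analytically x* = 1, f* = 1, phi(lambda) = lambda - lambda^2/4,
# maximized at lambda = 2 with phi(2) = 1.

xs = [i / 1000.0 * 2.0 for i in range(1001)]   # grid on Q = [0, 2]

# primal optimal value over the feasible part of the grid
f_star = min((x - 2.0) ** 2 for x in xs if x - 1.0 <= 0.0)

def phi(lam):
    # dual function: inf over Q of the Lagrangian L(x, lambda)
    return min((x - 2.0) ** 2 + lam * (x - 1.0) for x in xs)

# dual optimal value: sup over lambda >= 0 (grid on [0, 4])
f_low = max(phi(l / 100.0 * 4.0) for l in range(101))

print(f_star, f_low)   # both close to 1, with f_star >= f_low
```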
Bregman distances

Prox-function: d(·) is strongly convex on Q with parameter one:

d(y) ≥ d(x) + ⟨∇d(x), y − x⟩ + ½‖y − x‖²,  x, y ∈ Q.

Denote by x_0 the prox-center of the set Q: x_0 = argmin_{x ∈ Q} d(x). Assume d(x_0) = 0.

Bregman distance:

β(x, y) = d(y) − d(x) − ⟨∇d(x), y − x⟩,  x, y ∈ Q.

Clearly, β(x, y) ≥ ½‖x − y‖² for all x, y ∈ Q.

Bregman mapping: for x ∈ Q, g ∈ E^* and h > 0, define

B_h(x, g) = argmin_{y ∈ Q} { h⟨g, y − x⟩ + β(x, y) }.

Examples: Euclidean distance, entropy distance, etc.
Yu. Nesterov New primal-dual methods for functional constraints 11/19
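For the entropy example mentioned above, taking d(x) = ∑ x_i ln x_i + ln n on the probability simplex makes the prox-center the uniform vector with d(x_0) = 0, the Bregman distance the KL divergence, and the Bregman mapping a closed-form multiplicative update B_h(x, g)_i ∝ x_i exp(−h g_i). These closed forms are standard facts, not stated on the slide; the sketch below also checks the strong-convexity bound (here via Pinsker's inequality, in the ℓ₁ norm):

```python
import math

# Entropy prox-function on the simplex: beta(x, y) = KL(y || x),
# and the Bregman mapping is an exponentiated-gradient update.

def beta(x, y):
    # Bregman distance of the entropy prox-function = KL(y || x)
    return sum(yi * math.log(yi / xi) for xi, yi in zip(x, y))

def bregman_map(x, g, h):
    # argmin over the simplex of h<g, y - x> + beta(x, y)
    w = [xi * math.exp(-h * gi) for xi, gi in zip(x, g)]
    s = sum(w)
    return [wi / s for wi in w]

x = [0.2, 0.3, 0.5]
g = [1.0, -2.0, 0.5]
y = bregman_map(x, g, 1.0)

# y stays on the simplex, and (Pinsker) beta(x, y) >= ||x - y||_1^2 / 2.
l1 = sum(abs(a - b) for a, b in zip(x, y))
print(y, sum(y), beta(x, y) >= 0.5 * l1 * l1)
```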
Switching subgradient methods: Primal Method

Input parameter: the step size h > 0.

Initialization: compute the prox-center x_0.

Iteration k ≥ 0:
a) Define I_k = { i ∈ {1, …, m} : f_i(x_k) > h‖∇f_i(x_k)‖_* }.
b) If I_k = ∅, then compute x_{k+1} = B_h(x_k, ∇f_0(x_k)/‖∇f_0(x_k)‖_*).
c) If I_k ≠ ∅, then choose an arbitrary i_k ∈ I_k, define h_k = f_{i_k}(x_k)/‖∇f_{i_k}(x_k)‖_*², and compute x_{k+1} = B_{h_k}(x_k, ∇f_{i_k}(x_k)).

After t ≥ 0 iterations, define F_t = { k ∈ {0, …, t} : I_k = ∅ } and denote N(t) = |F_t|. It is possible that N(t) = 0.
Yu. Nesterov New primal-dual methods for functional constraints 12/19
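A minimal run of the scheme above, assuming the Euclidean prox-function (so B_h(x, g) is a projected step clip(x − h g)) and a toy two-dimensional instance chosen for illustration (neither is from the slides): min x₁ + x₂ subject to f₁(x) = 1 − x₁ ≤ 0 over Q = [−2, 2]², with x* = (1, −2) and f* = −1. Every productive iterate k ∈ F_t is nearly feasible by construction, since f₁(x_k) ≤ h‖∇f₁(x_k)‖_*:

```python
import math

# Switching subgradient method, Euclidean prox (illustrative assumption):
#   min x1 + x2  s.t.  f1(x) = 1 - x1 <= 0  on  Q = [-2, 2]^2.
# grad f0 = (1, 1), grad f1 = (-1, 0), so ||grad f1||_* = 1.

def clip(v):
    # Bregman mapping for the Euclidean prox = projection onto the box
    return [min(2.0, max(-2.0, c)) for c in v]

h = 0.05
x = [0.0, 0.0]                      # prox-center of the box
productive = []                     # iterates k with I_k empty (k in F_t)
for k in range(500):
    f1 = 1.0 - x[0]
    if f1 > h * 1.0:                # I_k nonempty: constraint step
        hk = f1 / 1.0               # h_k = f1 / ||grad f1||_*^2
        x = clip([x[0] + hk, x[1]])
    else:                           # I_k empty: normalized objective step
        productive.append(list(x))
        n0 = math.sqrt(2.0)         # ||grad f0||_*
        x = clip([x[0] - h / n0, x[1] - h / n0])

best = min(p[0] + p[1] for p in productive)
print(len(productive), best)        # best productive value, close to f* = -1
```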
Finding the dual multipliers

If N(t) > 0, define the dual multipliers as follows:

λ_t^{(0)} = h ∑_{k ∈ F_t} 1/‖∇f_0(x_k)‖_*,   λ_t^{(i)} = (1/λ_t^{(0)}) ∑_{k ∈ A_i(t)} h_k,  i = 1, …, m,

where A_i(t) = { k ∈ {0, …, t} : i_k = i }, 0 ≤ i ≤ m.

Denote S_t = ∑_{k ∈ F_t} 1/‖∇f_0(x_k)‖_*. If F_t = ∅, then we define S_t = 0.

For proving convergence of the switching strategy, we find an upper bound for the gap

δ_t = (1/S_t) ∑_{k ∈ F_t} f_0(x_k)/‖∇f_0(x_k)‖_* − φ(λ_t),

assuming that N(t) > 0.
Yu. Nesterov New primal-dual methods for functional constraints 13/19
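The sketch below accumulates these quantities during a run of the switching method on a toy instance (an assumption chosen for illustration: min x₁ + x₂ subject to f₁(x) = 1 − x₁ ≤ 0 over the box [−2, 2]², with f* = −1 and optimal multiplier λ* = 1, where φ(λ) = λ − 2 − 2|1 − λ| in closed form). It checks that λ_t^{(1)} approaches λ* and that the gap δ_t stays small:

```python
import math

# Dual multipliers from a switching-subgradient run (illustrative
# assumption): min x1 + x2, f1 = 1 - x1 <= 0, Q = [-2, 2]^2, so that
# ||grad f0||_* = sqrt(2) at every iterate, ||grad f1||_* = 1, and
# phi(lambda) = lambda - 2 - 2*|1 - lambda| on this instance.

h, x = 0.05, [0.0, 0.0]
S_t = 0.0            # sum over F_t of 1/||grad f0(x_k)||_*
sum_hk = 0.0         # sum over A_1(t) of h_k
weighted_f0 = 0.0    # sum over F_t of f0(x_k)/||grad f0(x_k)||_*
for k in range(2000):
    f1 = 1.0 - x[0]
    if f1 > h:                                   # constraint step
        sum_hk += f1                             # h_k = f1 here
        x[0] = min(2.0, x[0] + f1)
    else:                                        # productive step
        S_t += 1.0 / math.sqrt(2.0)
        weighted_f0 += (x[0] + x[1]) / math.sqrt(2.0)
        x = [min(2.0, max(-2.0, x[0] - h / math.sqrt(2.0))),
             min(2.0, max(-2.0, x[1] - h / math.sqrt(2.0)))]

lam0 = h * S_t                                   # lambda_t^(0)
lam1 = sum_hk / lam0                             # lambda_t^(1)
phi_val = lam1 - 2.0 - 2.0 * abs(1.0 - lam1)     # phi(lambda_t), closed form
delta = weighted_f0 / S_t - phi_val              # the gap delta_t
print(lam1, phi_val, delta)                      # lam1 near 1, small gap
```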
Finding the dual multipliers
if N(t) > 0, define the dual multipliers as follows:
λ(0)t = h
∑k∈Ft
1‖∇f0(xk )‖∗ ,
λ(i)t = 1
λ(0)t
∑k∈Ai (t)
hk , i = 1, . . . ,m,
where Ai (t) = {k ∈ {0, . . . , t} : ik = i}, 0 ≤ i ≤ m.
Denote St =∑
k∈Ft
1‖∇f0(xk )‖∗ . If Ft = ∅, then we define St = 0.
For proving convergence of the switching strategy, we find anupper bound for the gap
δt = 1St
∑k∈F(t)
f0(xk )‖∇f0(xk )‖∗ − φ(λt),
assuming that N(t) > 0.
Yu. Nesterov New primal-dual methods for functional constraints 13/19
![Page 117: New primal-dual subgradient methods for Convex Problems ...lear.inrialpes.fr/workshop/osl2015/slides/osl2015_yurii.pdf · New primal-dual subgradient methods for Convex Problems with](https://reader034.fdocuments.us/reader034/viewer/2022042402/5f11d47e3ee38a1c4f11c2c5/html5/thumbnails/117.jpg)
Finding the dual multipliers
if N(t) > 0, define the dual multipliers as follows:
λ(0)t = h
∑k∈Ft
1‖∇f0(xk )‖∗ ,
λ(i)t = 1
λ(0)t
∑k∈Ai (t)
hk , i = 1, . . . ,m,
where Ai (t) = {k ∈ {0, . . . , t} : ik = i}, 0 ≤ i ≤ m.
Denote St =∑
k∈Ft
1‖∇f0(xk )‖∗ . If Ft = ∅, then we define St = 0.
For proving convergence of the switching strategy, we find anupper bound for the gap
δt = 1St
∑k∈F(t)
f0(xk )‖∇f0(xk )‖∗ − φ(λt),
assuming that N(t) > 0.
Yu. Nesterov New primal-dual methods for functional constraints 13/19
![Page 118: New primal-dual subgradient methods for Convex Problems ...lear.inrialpes.fr/workshop/osl2015/slides/osl2015_yurii.pdf · New primal-dual subgradient methods for Convex Problems with](https://reader034.fdocuments.us/reader034/viewer/2022042402/5f11d47e3ee38a1c4f11c2c5/html5/thumbnails/118.jpg)
Convergence result

Main inequality:

λ_t^(0) δ_t ≤ r_0(x) + (1/2) N(t) h² − (1/2)(t − N(t)) h² = r_0(x) − (1/2) t h² + N(t) h².

Denote D = max_{x∈Q} r_0(x).

Theorem. If t ≥ (2/h²) D, then F(t) ≠ ∅. In this case δ_t ≤ Mh and max_{1≤i≤m} f_i(x_k) ≤ Mh, k ∈ F(t), where M = max_{0≤k≤t} max_{0≤i≤m} ‖∇f_i(x_k)‖_*.

Proof: If F(t) = ∅, then N(t) = 0. Consequently, λ_t^(0) = 0. This is impossible for t big enough. Finally, λ_t^(0) ≥ (h/M) N(t). Therefore, if t is big enough, then δ_t ≤ N(t) h² / λ_t^(0) ≤ Mh. □
Yu. Nesterov New primal-dual methods for functional constraints 14/19
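The phrase "impossible for t big enough" can be made precise directly from the main inequality; a short LaTeX sketch of the missing algebra (my reconstruction of the argument on the slide):

```latex
% If F(t) = \emptyset, then N(t) = 0 and \lambda_t^{(0)} = 0, so the main
% inequality gives, for every x \in Q,
0 \;=\; \lambda_t^{(0)}\,\delta_t
\;\le\; r_0(x) - \tfrac{1}{2}\,t h^2
\;\le\; D - \tfrac{1}{2}\,t h^2,
% which fails once t > \tfrac{2}{h^2} D; hence such t force F(t) \ne \emptyset.
% For t \ge \tfrac{2}{h^2} D the term r_0(x) - \tfrac{1}{2} t h^2 is
% nonpositive, so the main inequality reduces to
\lambda_t^{(0)}\,\delta_t \;\le\; N(t)\,h^2,
% and combining with \lambda_t^{(0)} \ge \tfrac{h}{M} N(t):
\delta_t \;\le\; \frac{N(t)\,h^2}{\lambda_t^{(0)}}
\;\le\; \frac{N(t)\,h^2\,M}{h\,N(t)} \;=\; M h .
```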
Dual subgradient method

Define the averaging coefficients {a_k}_{k≥0} and nondecreasing scaling coefficients {β_k}_{k≥0}. Denote A_t = Σ_{k=0}^t a_k.

Initialization: Define ℓ_0(x) ≡ 0, x ∈ Q.

Iteration k ≥ 0:

a) Compute x_k = argmin_{x∈Q} {ℓ_k(x) + β_k d(x)}.

b) Define I_k = {i ∈ [1:m] : f^(i)(x_k) ≥ ε}.

c) If I_k = ∅, then ℓ_{k+1}(x) = ℓ_k(x) + a_k [f^(0)(x_k) + 〈∇f^(0)(x_k), x − x_k〉].

d) If I_k ≠ ∅, then choose an arbitrary i_k ∈ I_k and define ℓ_{k+1}(x) = ℓ_k(x) + a_k [f^(i_k)(x_k) + 〈∇f^(i_k)(x_k), x − x_k〉].
Yu. Nesterov New primal-dual methods for functional constraints 15/19
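To make the scheme concrete, here is a minimal runnable sketch in Python on a toy instance of my own choosing (not from the slides): minimize f^(0)(x) = x over Q = [−1, 1] subject to f^(1)(x) = 0.25 − x ≤ 0, with prox function d(x) = x²/2, a_k ≡ 1, and β_k = √(k+1). Since each ℓ_k is linear, only its accumulated slope g matters for the argmin, which has the closed form clip(−g/β_k):

```python
import math

eps, t_max = 0.05, 2000
g = 0.0                    # slope of the accumulated linear model l_k
num, den = 0.0, 0.0        # a_k-weighted sum / total weight of productive x_k

for k in range(t_max + 1):
    beta = math.sqrt(k + 1)                 # nondecreasing scaling coefficient
    x = max(-1.0, min(1.0, -g / beta))      # x_k = argmin { l_k(x) + beta*d(x) }
    a = 1.0                                 # averaging coefficient a_k
    if 0.25 - x >= eps:                     # I_k nonempty: step on f^(1)
        g += a * (-1.0)                     # grad f^(1) = -1
    else:                                   # I_k empty: productive step on f^(0)
        g += a * 1.0                        # grad f^(0) = +1
        num, den = num + a * x, den + a

x_hat = num / den          # sigma_t-weighted average of the productive points
print(x_hat)
```

By construction every productive point satisfies f^(1)(x_k) < ε, so the average x̂ is ε-feasible (here x̂ > 0.2), and its objective value sits within ε of the optimum x* = 0.25 on the ε-relaxed problem.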
Convergence result

Define A_0(t) = {k ∈ [0:t] : I_k = ∅}, N(t) = |A_0(t)|, and

σ_t = Σ_{k∈A_0(t)} a_k, λ_t^(i) = (1/σ_t) Σ_{k∈A_i(t)} a_k, i = 1, ..., m,

where A_i(t) = {k ∈ [0:t] : i_k = i}, 1 ≤ i ≤ m.

If N(t) > 0, then define the gap δ_t = (1/σ_t) Σ_{k∈F(t)} a_k f^(0)(x_k) − φ(λ_t).

Denote D = max_{x∈Q} d(x).

Theorem. Let all subgradients be bounded by M. Then for any t ≥ 0,

σ_t (δ_t − ε) + A_t ε ≤ β_{t+1} D + (1/2) M² Σ_{k=0}^t a_k²/β_k.

If A_t ε > β_{t+1} D + (1/2) M² Σ_{k=0}^t a_k²/β_k, then λ_t^(0) > 0 and δ_t ≤ ε.

Example: a_t ≡ 1, β_t ≈ √t ⇒ t ≈ O(1/ε²).
Yu. Nesterov New primal-dual methods for functional constraints 16/19
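The O(1/ε²) claim in the example can be checked numerically. The following Python sketch (illustrative constants D = M = 1, my choice) evaluates the right-hand side of the theorem divided by A_t for a_k ≡ 1 and β_k = √(k+1); this quantity is the accuracy ε reachable after t iterations, and it decays like O(1/√t), i.e. t = O(1/ε²):

```python
import math

def eps_bound(t, D=1.0, M=1.0):
    # (beta_{t+1} D + (1/2) M^2 * sum_{k<=t} a_k^2 / beta_k) / A_t
    # for a_k = 1 and beta_k = sqrt(k + 1).
    A_t = t + 1
    s = sum(1.0 / math.sqrt(k + 1) for k in range(t + 1))  # sum a_k^2 / beta_k
    return (math.sqrt(t + 2) * D + 0.5 * M**2 * s) / A_t

# Quadrupling t should roughly halve the achievable accuracy:
print(eps_bound(100), eps_bound(400))
```

The printed pair shows the ≈ 1/√t decay: the second value is close to half the first.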
Quasi-monotone method

Initialization: Define ℓ_0(x) ≡ 0, x_0 = ./, and σ_0 = 0.

Iteration t ≥ 0:

a) Set v_t = argmin_{x∈Q} {ℓ_t(x) + β_t d(x)} and I_t := {i ∈ [1:m] : f^(i)(v_t) ≥ ε}.

b) If I_t ≠ ∅, then set x_{t+1} = x_t, σ_{t+1} = σ_t, and choose an arbitrary i_t ∈ I_t. Update ℓ_{t+1}(x) = ℓ_t(x) + a_{t+1} [f^(i_t)(v_t) + 〈∇f^(i_t)(v_t), x − v_t〉].

c) Otherwise, set σ_{t+1} = σ_t + a_{t+1}, τ_t = a_{t+1}/σ_{t+1}, and x_{t+1} = (1 − τ_t) x_t + τ_t v_t. Update ℓ_{t+1}(x) = ℓ_t(x) + a_{t+1} [f^(0)(x_{t+1}) + 〈∇f^(0)(x_{t+1}), x − x_{t+1}〉].

The notation x_0 = ./ ∈ E indicates that x_0 is not chosen yet.
Yu. Nesterov New primal-dual methods for functional constraints 17/19
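The convex combination in step c) is just an online way of forming the σ-weighted average of the productive points: with τ_t = a_{t+1}/σ_{t+1}, the recursion x_{t+1} = (1 − τ_t) x_t + τ_t v_t reproduces (Σ a_k v_k)/(Σ a_k) exactly. A quick Python check with arbitrary illustrative weights:

```python
# Weights a_k and productive points v_k (arbitrary illustrative numbers).
a = [0.5, 1.0, 2.0, 0.25]
v = [3.0, -1.0, 4.0, 2.0]

sigma, x = 0.0, 0.0   # x = 0.0 stands in for x_0 = ./ ; the first productive
                      # step has tau = 1, so this placeholder never contributes
for ak, vk in zip(a, v):
    sigma += ak                     # sigma_{t+1} = sigma_t + a_{t+1}
    tau = ak / sigma                # tau_t = a_{t+1} / sigma_{t+1}
    x = (1 - tau) * x + tau * vk    # x_{t+1} = (1 - tau_t) x_t + tau_t v_t

avg = sum(ak * vk for ak, vk in zip(a, v)) / sum(a)
print(x, avg)   # the two values coincide
```

This explains why x_0 may remain unchosen: it enters every later iterate with total weight zero.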
Convergence result

Theorem.

1. All points x_t ≠ x_⊳⊲ are ε-feasible.

2. If all subgradients are bounded by M, then for all x ∈ Q and t ≥ 0,

   σ_t (f^(0)(x_t) − ε) + A_t ε ≤ ℓ_t(x) + β_t d(x) + (1/2) M² Σ_{k=1}^{t} a_k² / β_{k−1}.

3. As soon as A_t ε > β_t D + (1/2) M² Σ_{k=1}^{t} a_k² / β_{k−1}, we get σ_t > 0 and

   f(x_t) − φ(λ_t) ≤ f(x_t) − (1/σ_t) min_{x∈Q} ℓ_t(x) ≤ ε.

Example: a_t ≡ 1, β_t ≈ √t ⇒ t ≈ O(1/ε²).

NB: this is true for the whole sequence!
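The O(1/ε²) estimate can be checked numerically: with a_t ≡ 1 and β_t = √(t+1), the sum Σ a_k²/β_{k−1} = Σ 1/√k grows like 2√t, so the condition in item 3 first holds at t ≈ ((D + M²)/ε)², and halving ε should roughly quadruple that t. A small sanity check (the constants D = M = 1 are arbitrary choices for the experiment):

```python
import math

def first_t(eps, D=1.0, M=1.0):
    # Smallest t with  A_t * eps > beta_t * D + (M^2 / 2) * sum_{k=1}^t a_k^2 / beta_{k-1},
    # for a_k == 1 (so A_t = t) and beta_k = sqrt(k + 1) (so beta_{k-1} = sqrt(k)).
    s = 0.0
    t = 0
    while True:
        t += 1
        s += 1.0 / math.sqrt(t)            # a_t^2 / beta_{t-1}
        if t * eps > math.sqrt(t + 1.0) * D + 0.5 * M * M * s:
            return t

t1 = first_t(0.1)
t2 = first_t(0.05)
print(t1, t2, t2 / t1)   # the ratio should be close to 4, i.e. t = O(1/eps^2)
```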
Yu. Nesterov New primal-dual methods for functional constraints 18/19
Conclusion
1. The optimal primal-dual solution can be approximated by simple switching subgradient schemes.

2. The approximate dual multipliers have a natural interpretation: the relative importance of the corresponding constraints during the adjustment process.

3. Moreover, the method retains its optimal worst-case efficiency estimate even if the dual optimal solution does not exist.

4. Many interesting questions remain open (influence of smoothness, strong convexity, etc.).
Thank you for your attention!
Yu. Nesterov New primal-dual methods for functional constraints 19/19