Introduction to optimization

Olga Galinina (olga.galinina@tut.fi), TG412
ELT-53656 Network Analysis and Dimensioning II
Tampere University of Technology, Tampere, Finland
February 10, 2016


Outline

1. Introduction
2. Mathematical optimization
3. Appendix


Content to be discussed during the next 5 weeks...

This is not an Optimization Theory course... and we do not pretend it is.

Our goal is to (re)visit the basics of optimization methods, to broaden the horizons of good engineers:

Basics of mathematical optimization (today)

Linear programming

Convex programming

Mixed-integer and integer programming

Shortest-path algorithms (which came from graph theory)


History...

Leonhard Euler (1707-1783): "nothing at all takes place in the Universe in which some rule of maximum or minimum does not appear"


Where Do Optimization Problems Arise?

Methods of optimization deal with finding the optimum solution for a mathematical model; in most cases, this means finding maxima and minima of functions subject to some constraints.

Manufacturing and transportation systems

Scheduling of goods for manufacturing

Transportation of goods over transportation networks

Scheduling of fleets of airplanes

Communication systems

Design of communication systems

Flow of information across networks

Financial systems, and much more...


Examples

Wireless network optimization

variables: power for wireless, investments in infrastructure

constraints: budget, max/min Tx power, distance, rate...

objective: system/user throughput, energy efficiency, etc.

Data fitting

variables: model parameters

constraints: prior information, parameter limits

objective: measure of prediction error


Mathematical Model

Consider a (mathematical) optimization problem:

minimize f(x), x ∈ R^n
subject to x ∈ Ω ⊂ R^n.

Definition

The function f(x) : R^n → R is a real-valued function, called the objective function, or cost function.

Definition

The variables x = [x1, ..., xn] are the optimization variables.

Definition

An optimal point x0 satisfies f(x0) ≤ f(x), ∀x ∈ Ω.
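As a minimal executable sketch of the model above (the quadratic objective and its minimizer are assumptions for illustration, and `scipy.optimize.minimize` stands in for a generic solver):

```python
# Unconstrained case (Omega = R^n): minimize an assumed objective over R^2.
import numpy as np
from scipy.optimize import minimize

def f(x):
    # illustrative smooth objective with a unique minimizer at (1, 2)
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

# note: scipy's x0 argument is the starting point, not the optimal point
result = minimize(f, x0=np.zeros(2))
print(result.x)  # approaches the optimal point [1, 2]
```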


Constraint Set

Definition

The set Ω ⊂ R^n is the constraint set, or feasible set/region.

Definition

The above problem is the general form of a constrained optimization problem. If Ω = R^n, we refer to the problem as unconstrained.

Ω typically takes the form Ω = {x : hi(x) = 0, gj(x) ≤ 0},

where hi(x), gj(x) are the constraint functions.

Definition

If Ω = ∅, the problem is infeasible; otherwise it is feasible.
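Feasibility is easy to illustrate numerically; the particular h and g below are assumed examples, not taken from the slides.

```python
# Feasibility check for an assumed set Omega = {x : h(x) = 0, g(x) <= 0}.
import numpy as np

def h(x):
    # equality constraint: points must lie on the line x1 + x2 = 1
    return x[0] + x[1] - 1.0

def g(x):
    # inequality constraint: require x1 >= 0
    return -x[0]

def is_feasible(x, tol=1e-9):
    return abs(h(x)) <= tol and g(x) <= tol

print(is_feasible(np.array([0.3, 0.7])))   # True: on the line, x1 >= 0
print(is_feasible(np.array([-0.5, 1.5])))  # False: violates g(x) <= 0
```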


Classification

Problems:

Linear/nonlinear (defined by the form of f(x), hi(x), gj(x))

Convex/non-convex (defined by the properties of f(x) and Ω)

Discrete/continuous/integer/mixed-integer/...

One-dimensional/multi-dimensional

One extremum/multiple extrema

Methods:

Direct (zero-order) methods use information only about f(x)

allow analyzing functions defined algorithmically

First-order methods use information about f(x) and ∇f(x)

e.g., gradient methods

Second-order methods use f(x), ∇f(x), ∇²f(x)

e.g., Newton's method and its modifications
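A first-order method can be sketched in a few lines; the fixed step size, the iteration budget, and the quadratic test function are assumptions for illustration, not course material.

```python
# Gradient descent: x_{k+1} = x_k - step * grad f(x_k), using only the gradient.
import numpy as np

c = np.array([3.0, -1.0])       # assumed minimizer of f(x) = ||x - c||^2

def grad_f(x):
    return 2.0 * (x - c)        # gradient of the quadratic objective

x = np.zeros(2)
step = 0.1                      # fixed step size (assumed small enough here)
for _ in range(200):
    x = x - step * grad_f(x)

print(x)  # converges to the minimizer [3, -1]
```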


Solving Optimization Problems

General optimization problem

very difficult to solve

methods involve some compromise, e.g., very long computation time, or not always finding the solution

Exceptions: certain problem classes can be solved efficiently and reliably

least-squares problems

linear programming problems

convex optimization problems


Least-squares problem

Example: linear regression bi = ∑j aij xj + ε

minimize ‖Ax − b‖₂² = ∑i (∑j aij xj − bi)²

Solving least-squares problems

analytical solution: x0 = (AᵀA)⁻¹Aᵀb

reliable and efficient algorithms and software

computation time proportional to n²k (A ∈ R^{k×n}); less if structured

a mature technology

Using least-squares

least-squares problems are easy to recognize

a few standard techniques increase flexibility (e.g., including weights, adding regularization terms)
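The analytical solution above can be checked against a library solver; the random data here are purely illustrative.

```python
# Compare x0 = (A^T A)^{-1} A^T b with numpy's least-squares routine.
import numpy as np

rng = np.random.default_rng(0)
k, n = 20, 3                                  # A in R^{k x n}, k >= n
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

x_analytic = np.linalg.inv(A.T @ A) @ A.T @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_analytic, x_lstsq))  # True: both give the same minimizer
```

In practice `lstsq` (based on a QR or SVD factorization) is preferred over forming AᵀA explicitly, which can be numerically ill-conditioned.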


Linear programming

minimize cᵀx = ∑j cj xj
subject to aiᵀx ≤ bi, i = 1, ..., m

Solving linear programs

no analytical formula for the solution

reliable and efficient algorithms and software

computation time proportional to n²m if m ≥ n; less with structure

a mature technology

Using linear programming

easy to recognize, but sometimes not as easy as LS problems

a few standard tricks are used to convert problems into linear programs (e.g., problems involving ℓ1- or ℓ∞-norms, piecewise-linear functions)
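A toy linear program in the form above can be handed to `scipy.optimize.linprog`; the numbers are assumptions, and note that linprog additionally imposes x ≥ 0 by default.

```python
# Small LP: minimize c^T x subject to a_i^T x <= b_i (and x >= 0 by default).
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])           # maximize x1 + 2*x2, written as a minimization
A_ub = np.array([[1.0, 1.0],         # x1 + x2 <= 4
                 [0.0, 1.0]])        # x2 <= 3
b_ub = np.array([4.0, 3.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x)  # optimal vertex [1, 3]
```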


Convex optimization problem

minimize f(x)
subject to gi(x) ≤ bi, i = 1, ..., m

objective and constraint functions are convex:

gi(α1x + α2y) ≤ α1gi(x) + α2gi(y),

if α1 + α2 = 1, α1 ≥ 0, α2 ≥ 0.

includes least-squares problems and linear programs as special cases
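A tiny convex instance of this form (the quadratic objective and the single linear constraint are assumed for illustration) can be solved with scipy's SLSQP method:

```python
# Convex problem: minimize ||x||^2 subject to x1 + x2 >= 2 (a convex feasible set).
import numpy as np
from scipy.optimize import minimize

def f(x):
    return x[0] ** 2 + x[1] ** 2      # convex quadratic objective

# scipy's "ineq" convention is fun(x) >= 0, i.e. g(x) = 2 - x1 - x2 <= 0
cons = {"type": "ineq", "fun": lambda x: x[0] + x[1] - 2.0}

res = minimize(f, x0=np.array([5.0, 5.0]), method="SLSQP", constraints=cons)
print(res.x)  # closest point of the half-plane to the origin: [1, 1]
```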


Convex optimization problems

Solving convex optimization problems

no analytical solution

reliable and efficient algorithms

computation time (roughly) proportional to max{n³, n²m, F}, where F is the cost of evaluating f(x) and its first and second derivatives

almost a technology

Using convex optimization

often difficult to recognize (requires a proof of convexity)

many tricks for transforming problems into convex form

surprisingly many problems can be solved via convex optimization


Nonlinear optimization

Traditional techniques for general nonconvex problems involve compromises

Local optimization methods (nonlinear programming)

find a point that minimizes f among feasible points near it

fast, can handle large problems

require an initial guess

provide no information about the distance to the (global) optimum

Global optimization methods

find the (global) solution

worst-case complexity grows exponentially with problem size

These algorithms are often based on solving convex subproblems
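The initial-guess dependence of local methods can be seen on an assumed one-dimensional nonconvex function with two basins of attraction:

```python
# Two runs of a local method from different starting points on a nonconvex f.
import numpy as np
from scipy.optimize import minimize

def f(x):
    # two local minima near x = -1 and x = +1; the tilt makes only the left global
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]

left = minimize(f, x0=np.array([-2.0]))
right = minimize(f, x0=np.array([2.0]))
print(left.x, left.fun)    # the global minimum, near x = -1
print(right.x, right.fun)  # typically a local, non-global minimum near x = +1
```

A local method simply reports the stationary point its basin leads to; it provides no certificate of global optimality.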


Local and global extrema

The following theorem helps define whether a local solution is a global one:

Theorem (Weierstrass)

A continuous real-valued function f(x) defined on a non-empty compact set S attains a maximum and a minimum at points xmin, xmax ∈ S.

Reminder: a compact set S ⊂ R^n is a closed and bounded set.
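The theorem can be illustrated numerically on the compact set S = [0, 1]; the function x(1 − x) is an assumed example of a continuous f.

```python
# f is continuous on the compact set [0, 1], so it attains both extrema there.
import numpy as np

f = lambda x: x * (1.0 - x)
grid = np.linspace(0.0, 1.0, 100001)  # dense sample of S = [0, 1]
values = f(grid)

print(grid[np.argmax(values)])  # maximizer near x = 0.5
print(values.min())             # minimum value 0.0, attained at the endpoints
```

On a non-compact set (e.g., the open interval (0, 1) with f(x) = x) the infimum need not be attained, which is why compactness matters.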


Affine and Convex Sets

Definition

S ⊂ R^n is affine if [x, y ∈ S, α ∈ R] ⇒ αx + (1 − α)y ∈ S.

Definition

S ⊂ R^n is convex if for all [x, y ∈ S, 0 < α < 1] ⇒ z = αx + (1 − α)y ∈ S (z is a convex combination of x and y).

If x1, ..., xm ∈ R^n, ∑j αj = 1, αj > 0, then x = α1x1 + ... + αmxm is a convex combination of x1, ..., xm. The intersection of (any number of) convex sets is convex.
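A quick randomized check of the definition on an assumed convex set, the closed unit ball in R^2:

```python
# Verify that convex combinations of points of S = {x : ||x|| <= 1} stay in S.
import numpy as np

rng = np.random.default_rng(1)

def in_ball(x):
    return np.linalg.norm(x) <= 1.0 + 1e-12

for _ in range(1000):
    # draw two points of S (projecting into the ball) and a coefficient in (0, 1)
    x, y = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 2)
    x = x / max(1.0, np.linalg.norm(x))
    y = y / max(1.0, np.linalg.norm(y))
    a = rng.uniform(0.0, 1.0)
    z = a * x + (1.0 - a) * y        # convex combination
    assert in_ball(z)                # never leaves the convex set

print("all convex combinations stayed in S")
```

The assertion is exactly the triangle inequality ‖z‖ ≤ α‖x‖ + (1 − α)‖y‖ ≤ 1 checked on samples.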


Compact Sets

Let Bδ(x0) denote the open ball of radius δ centered at the point x0: Bδ(x0) = {x : ‖x − x0‖ < δ}.

Definition

A set S ⊂ R^n is said to be open if for each point x0 ∈ S there is δ > 0 such that Bδ(x0) ⊂ S. A set S ⊂ R^n is said to be closed if its complement R^n \ S is open.

Definition

A set S is compact if each of its open covers has a finite subcover: for every family {Ci}i∈A with S ⊂ ∪i∈A Ci there exists a finite J ⊂ A such that S ⊂ ∪j∈J Cj.

Alternative: every sequence in S has a convergent subsequence whose limit lies in S. Note: if S ⊂ R^n is closed and bounded, then S is compact (Heine-Borel theorem).

Page 44: Introduction to optimization · Introduction to optimization Introduction Mathematical optimization Appendix Content to be discussed during the next 5 weeks... This is not Optimization

Introductionto

optimization

Introduction

Mathematicaloptimization

Appendix

Compact Sets

Let Bδ(x0) denote the open ball of radius δ centered at thepoint x : Bδ(x0) = x : ||x − x0|| < δ.

Definition

Set S ⊂ Rn is said to be open if for each point x0 ∈ S there isδ such that Bδ(x0). A set S ⊂ Rn is said to be closed if itscomplement Rn \ S is open.

Definition

Set S is compact if each of its open covers has a finitesubcover: ∀Cii∈A, S ⊂ Cii∈A ∃ finite J : S ⊂ Cjj∈J .

Alternative: every sequence in S has a convergentsubsequence, whose limit lies in S .Note: If S ⊂ Rn, closed and bounded, then S - compact(Heine-Borel theorem).

Page 45: Introduction to optimization · Introduction to optimization Introduction Mathematical optimization Appendix Content to be discussed during the next 5 weeks... This is not Optimization

Introductionto

optimization

Introduction

Mathematicaloptimization

Appendix

Compact Sets

Let Bδ(x0) denote the open ball of radius δ centered at thepoint x : Bδ(x0) = x : ||x − x0|| < δ.

Definition

Set S ⊂ Rn is said to be open if for each point x0 ∈ S there isδ such that Bδ(x0). A set S ⊂ Rn is said to be closed if itscomplement Rn \ S is open.

Definition

Set S is compact if each of its open covers has a finitesubcover: ∀Cii∈A, S ⊂ Cii∈A ∃ finite J : S ⊂ Cjj∈J .

Alternative: every sequence in S has a convergentsubsequence, whose limit lies in S .Note: If S ⊂ Rn, closed and bounded, then S - compact(Heine-Borel theorem).

Page 46: Introduction to optimization · Introduction to optimization Introduction Mathematical optimization Appendix Content to be discussed during the next 5 weeks... This is not Optimization

Introductionto

optimization

Introduction

Mathematicaloptimization

Appendix

Compact Sets

Let Bδ(x0) denote the open ball of radius δ centered at thepoint x : Bδ(x0) = x : ||x − x0|| < δ.

Definition

Set S ⊂ Rn is said to be open if for each point x0 ∈ S there isδ such that Bδ(x0). A set S ⊂ Rn is said to be closed if itscomplement Rn \ S is open.

Definition

Set S is compact if each of its open covers has a finitesubcover: ∀Cii∈A, S ⊂ Cii∈A ∃ finite J : S ⊂ Cjj∈J .

Alternative: every sequence in S has a convergentsubsequence, whose limit lies in S .Note: If S ⊂ Rn, closed and bounded, then S - compact(Heine-Borel theorem).

Convex functions

Definition
Let C ⊂ Rn be a nonempty convex set. Then f : C → R is convex (on C) if for all x, y ∈ C and all α ∈ (0, 1):
f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y).

If strict inequality holds whenever x ≠ y, then f is said to be strictly convex. The negative of a (strictly) convex function is called a (strictly) concave function.

Convex functions

Operations preserving convexity:

nonnegative multiple: αf is convex if f is convex and α ≥ 0

sum: f1 + f2 is convex if f1, f2 are convex (extends to infinite sums and integrals)

composition with an affine function: f(Ax + b) is convex if f is convex

Some univariate convex functions:
1. exponential: f(x) = e^(αx) for all real α
2. powers: f(x) = x^p for x ≥ 0 and 1 ≤ p < ∞
3. powers: f(x) = x^p for x > 0 and −∞ < p ≤ 0
Concave:
1. powers: f(x) = x^p for x ≥ 0 and 0 ≤ p ≤ 1
2. logarithm: f(x) = log x for x > 0.
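The defining inequality can be spot-checked numerically. Here is a minimal Python sketch (the function name and sampling scheme are mine, not from the slides): it draws random points and a random convex combination and checks the inequality directly. A violation disproves convexity; passing only suggests it, since finitely many samples prove nothing.

```python
import math
import random

def is_convex_on_samples(f, lo, hi, trials=1000, tol=1e-9):
    """Randomized test of the defining inequality
        f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y)
    on [lo, hi].  Returns False on any (tolerance-adjusted) violation."""
    for _ in range(trials):
        x, y = random.uniform(lo, hi), random.uniform(lo, hi)
        a = random.random()
        lhs = f(a * x + (1 - a) * y)
        rhs = a * f(x) + (1 - a) * f(y)
        if lhs > rhs + tol:
            return False
    return True

random.seed(0)
print(is_convex_on_samples(lambda x: math.exp(2 * x), -2.0, 2.0))  # True: e^(2x) is convex
print(is_convex_on_samples(lambda x: math.log(x), 0.1, 5.0))       # False: log is concave
```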

Differentials

f is differentiable if the domain of f is open and the gradient of f,

∇f(x) ≡ (∂f/∂x1, ..., ∂f/∂xn)^T,

exists for all x ∈ dom(f).

f is twice differentiable if the domain of f is open and the Hessian of f,

H ≡ D²f(x) = [∂²f/∂xi∂xj], i, j = 1, ..., n,

i.e. the n × n matrix with entry ∂²f/∂xi∂xj in row i, column j, exists for all x ∈ dom(f).

Note: not all convex functions are differentiable.
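When closed-form derivatives are inconvenient, the gradient and Hessian can be approximated by central finite differences. A minimal Python sketch (function names and step sizes are illustrative assumptions, not from the slides):

```python
def num_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x,
    one coordinate at a time (x is a list of floats)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def num_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x:
    entry (i, j) estimates the mixed partial d^2 f / dxi dxj."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            vals = []
            for si, sj in ((h, h), (h, -h), (-h, h), (-h, -h)):
                z = list(x)
                z[i] += si
                z[j] += sj
                vals.append(f(z))
            H[i][j] = (vals[0] - vals[1] - vals[2] + vals[3]) / (4 * h * h)
    return H

# f(x) = x1^2 + 3*x1*x2 has gradient (2*x1 + 3*x2, 3*x1) and Hessian [[2, 3], [3, 0]].
f = lambda v: v[0] ** 2 + 3 * v[0] * v[1]
print(num_gradient(f, [1.0, 2.0]))  # ≈ [8.0, 3.0]
print(num_hessian(f, [1.0, 2.0]))   # ≈ [[2.0, 3.0], [3.0, 0.0]]
```

For a quadratic the central differences are exact up to floating-point roundoff, which is why the test function above is a convenient sanity check.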

First-order condition

Theorem (gradient inequality)
A differentiable f is convex on a convex set C ⊂ Rn if and only if for all x, y ∈ C:
f(y) ≥ f(x) + (∇f(x))^T (y − x).

Theorem
Minimizing a differentiable convex function f(x) s.t. x ∈ C
⇔
Find x* ∈ C such that (∇f(x*))^T (y − x*) ≥ 0 for all y ∈ C (variational inequality problem).
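The gradient inequality says a convex function lies above all of its tangent hyperplanes. A minimal Python sketch of a randomized spot-check (function names and the box-shaped sample region are mine, not from the slides):

```python
import random

def gradient_inequality_holds(f, grad, lo, hi, trials=500, tol=1e-9):
    """Sample pairs (x, y) from the box prod_i [lo[i], hi[i]] and check
    f(y) >= f(x) + grad(x)^T (y - x).  For differentiable f, this holding
    for ALL pairs is equivalent to convexity; here it is only spot-checked."""
    n = len(lo)
    for _ in range(trials):
        x = [random.uniform(lo[i], hi[i]) for i in range(n)]
        y = [random.uniform(lo[i], hi[i]) for i in range(n)]
        g = grad(x)
        linear = f(x) + sum(g[i] * (y[i] - x[i]) for i in range(n))
        if f(y) < linear - tol:
            return False
    return True

# Convex quadratic: f(y) - [f(x) + grad(x)^T (y - x)] = ||y - x||^2 >= 0 exactly.
f = lambda v: v[0] ** 2 + v[1] ** 2
grad = lambda v: [2 * v[0], 2 * v[1]]
random.seed(1)
print(gradient_inequality_holds(f, grad, [-3.0, -3.0], [3.0, 3.0]))  # True
```

For the concave function −(x1² + x2²) the same check fails almost immediately, since the surplus term flips sign.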

Second-order condition

Theorem
A twice differentiable f is convex on C ⊂ Rn if and only if the Hessian matrix ∇²f(x) is positive semidefinite for all x ∈ C.

Note: if ∇²f(x) is positive definite for all x ∈ C, then f is strictly convex on C. The converse is false.
Example: f(x) = x^4 is strictly convex on R, yet f''(0) = 0, so the Hessian is not positive definite at x = 0.
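Positive semidefiniteness can be tested numerically by attempting a Cholesky factorization of the slightly shifted matrix H + tol·I, which succeeds exactly when the shifted matrix is positive definite. A minimal pure-Python sketch (the function name and tolerance scheme are mine, not from the slides):

```python
def is_psd(H, tol=1e-10):
    """Test whether the symmetric matrix H (list of lists) is positive
    semidefinite by attempting a Cholesky factorization of H + tol*I;
    the tiny diagonal shift turns 'PSD' into 'PD up to tolerance'."""
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = H[i][i] + tol - s
                if d <= 0.0:
                    return False          # nonpositive pivot: not PSD
                L[i][i] = d ** 0.5
            else:
                L[i][j] = (H[i][j] - s) / L[j][j]
    return True

print(is_psd([[2.0, 0.0], [0.0, 2.0]]))   # True  (positive definite)
print(is_psd([[2.0, 0.0], [0.0, -2.0]]))  # False (indefinite, a saddle)
print(is_psd([[0.0]]))                    # True  (PSD but not PD, like f''(0) for x^4)
```

The last call mirrors the note above: a semidefinite-but-not-definite Hessian at a point does not rule out strict convexity.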
