
AM 121: Introduction to Optimization Models and Methods

Lecture 4: Basic Feasible Solutions, Extreme points

Yiling Chen, Harvard SEAS

1

Lesson Plan

Polyhedron, convexity and optimality

Basis, basic feasible solutions (BFS), optimality

Optimization ↔ Geometry ↔ Algebra

Jensen & Bard: 3.1, 3.3

2

Notation

Vector will usually mean column vector

m × n matrix:

A = (A1 A2 ... An), with columns A1, ..., An and rows a1ᵀ, ..., amᵀ

Ai is the i-th column, ajᵀ is the j-th row.

AB is the matrix formed from the columns of A indexed by the set B (e.g., B = {1, 3}).

S \ T is the set of elements of S that do not belong to T.

Rn is the set of n-dimensional vectors

x = (x1, x2, ..., xn)ᵀ ∈ Rn;  Ax = ∑i xi Ai

x ≥ 0 and x > 0: every component of x is nonnegative (respectively, positive)

3
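The identity Ax = ∑i xi Ai (the "column picture" of matrix-vector multiplication) can be checked in a few lines of plain Python. A minimal sketch; the helper names are hypothetical, and the small matrix is chosen for illustration:

```python
def mat_vec(A, x):
    """Row picture: each entry of Ax is a row of A dotted with x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def column_combination(A, x):
    """Column picture: Ax = sum_i x_i * A_i, where A_i is the i-th column."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for i in range(n):            # add x_i times column A_i
        for r in range(m):
            result[r] += x[i] * A[r][i]
    return result

A = [[1, 0, 1, 0],
     [1, 2, 0, 1]]
x = [2, 1, 0, 2]
print(mat_vec(A, x))             # [2, 6]
print(column_combination(A, x))  # [2, 6] -- same vector, built from columns
```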

Recall: LP Standard Forms

Standard equality form:

max cᵀx

s.t. Ax = b

x ≥ 0

maximization problem

m equality constraints

n inequality constraints

Standard inequality form:

max cᵀx

s.t. Ax ≤ b

x ≥ 0

Can transform any LP into standard forms.

4
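One common step of that transformation can be sketched directly: turn Ax ≤ b, x ≥ 0 into equality form by appending one nonnegative slack variable per inequality row. A sketch in plain Python (`add_slacks` is a hypothetical helper; the example constraints x1 ≤ 2 and x1 + 2x2 ≤ 4 reappear later in the lecture):

```python
def add_slacks(A, b):
    """Return the matrix [A | I] of the equality-form constraints Ax + s = b."""
    m = len(A)
    A_eq = []
    for i, row in enumerate(A):
        slack = [1 if j == i else 0 for j in range(m)]  # column of identity
        A_eq.append(list(row) + slack)
    return A_eq, list(b)

# Example: x1 <= 2 and x1 + 2*x2 <= 4
A = [[1, 0],
     [1, 2]]
b = [2, 4]
A_eq, b_eq = add_slacks(A, b)
print(A_eq)  # [[1, 0, 1, 0], [1, 2, 0, 1]]
```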

Convex Sets

Given y, z ∈ Rn, y ≠ z, define the open line segment (y, z) = {λy + (1 − λ)z | 0 < λ < 1}.

Definition: A set F ⊆ Rn is convex when y, z ∈ F, y ≠ z ⇒ (y, z) ⊆ F.

5

Polyhedron

Definition

A polyhedron is a set that can be described in the form P = {x ∈ Rn | Ax ≤ b}.

Fact

The feasible set of any LP can be described as a polyhedron.

6

Halfspaces

Definition

Let a ∈ Rn, a ≠ 0, and let b be a scalar. Then,

{x ∈ Rn | aᵀx = b} is a set of points that forms a hyperplane

{x ∈ Rn | aᵀx ≤ b} is a set of points that forms a halfspace

Note: vector a is perpendicular to the hyperplane

If x and y are on the same hyperplane, then aᵀx = aᵀy, so aᵀ(x − y) = 0, and a is orthogonal to any vector on the hyperplane.

(x − y is a vector on the hyperplane.)

7
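The orthogonality claim can be sanity-checked numerically. A sketch using the hyperplane 2x1 + x2 = 4 from the figure on the next slide (`dot` is a hypothetical helper):

```python
def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

a, b = [2, 1], 4   # the hyperplane 2*x1 + x2 = 4 with normal a = (2, 1)
x = [2, 0]         # 2*2 + 0 = 4: on the hyperplane
y = [0, 4]         # 2*0 + 4 = 4: on the hyperplane

assert dot(a, x) == b and dot(a, y) == b
diff = [xi - yi for xi, yi in zip(x, y)]  # a vector lying on the hyperplane
print(dot(a, diff))  # 0: a is orthogonal to x - y
```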

[Figure: the halfspace 2x1 + x2 ≥ 4. The hyperplane 2x1 + x2 = 4 passes through (2, 0) and (0, 4); the vector (2, 1) is perpendicular to it.]

8

Theorem

A halfspace is convex.

Theorem

The intersection of convex sets is convex.

Theorem

Every polyhedron is an intersection of halfspaces, and convex.

9
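The first theorem can be probed numerically (a sketch, not a proof): take two points in a halfspace and check that sampled points of the open segment between them stay inside. The particular halfspace 2x1 + x2 ≤ 4 and the helper name are assumptions for illustration:

```python
def in_halfspace(a, b, x):
    """Is a^T x <= b?"""
    return sum(ai * xi for ai, xi in zip(a, x)) <= b

a, b = [2, 1], 4                 # the halfspace 2*x1 + x2 <= 4
y, z = [0.0, 0.0], [2.0, 0.0]    # two points inside it
assert in_halfspace(a, b, y) and in_halfspace(a, b, z)

for k in range(1, 10):           # sample lambda in (0, 1)
    lam = k / 10
    p = [lam * yi + (1 - lam) * zi for yi, zi in zip(y, z)]
    assert in_halfspace(a, b, p)
print("sampled points of (y, z) all lie in the halfspace")
```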

Intuition: Optimality

In two dimensions, convexity ≡ the angle inside the feasible region at each corner is ≤ 180°.

Let v denote the value of an optimal solution x∗. The hyperplane cᵀx = v through x∗ "separates" all points in the feasible polyhedron from x∗, showing they have no greater objective value.

10

Extreme Points

Definition

Let P be a polyhedron. x ∈ P is an extreme point if we cannot find y, z ∈ P, both different from x, such that x ∈ (y, z).

11

Existence of Extreme Points

Definition

A polyhedron P contains a line if there exists x ∈ P and a nonzero vector d ∈ Rn s.t. x + λd ∈ P for all scalars λ.

Theorem

A polyhedron P contains an extreme point if and only if it does notcontain a line.

e.g., P contains a line, but Q does not (figure from MIT 6.251).

12

Optimality of Extreme Points

Theorem

Consider max cᵀx over a polyhedron P. Suppose P has at least one extreme point, and there exists an optimal solution. Then there exists an optimal, extreme solution.

Proof.

P = {x ∈ Rn | Ax ≤ b}, x∗ optimal with v = cᵀx∗.

Define Q = {x ∈ Rn | Ax ≤ b, cᵀx = v} ⊆ P, nonempty since x∗ ∈ Q. Because P has an extreme point it contains no line, so Q contains no line either; hence Q is a polyhedron with an extreme point.

Let x′ be an extreme point of Q. Suppose for contradiction that x′ is not extreme in P. Then x′ = λy + (1 − λ)z, for y, z ∈ P, not equal to x′, with λ ∈ (0, 1).

We have cᵀx′ = λcᵀy + (1 − λ)cᵀz = v, and by optimality of v we have cᵀy ≤ v and cᵀz ≤ v. So cᵀy = cᵀz = v, and y, z ∈ Q. But then x′ is not an extreme point in Q. Contradiction.

13

How can an LP have an optimal solution but no extreme point?

How can an LP have an extreme point but no optimal solution?

14

Overview

Optimization ↔ Geometry ↔ Algebra

15

Review: Basis

The span of vectors y1, ..., yK in Rm is the set of vectors z of the form z = ∑k dk·yk, where each dk is a scalar.

Vectors y1, ..., yK are linearly independent if and only if the only solution of ∑k dk·yk = 0 is dk = 0 for all k. (If linearly dependent, then one can be written as a linear combination of the others.)

A basis of Rm is a collection of linearly-independent vectors in Rm that span Rm. (Any m linearly-independent vectors will provide a basis.)

Examples

Consider R2. What about:

{(1, 0)ᵀ, (0, 1)ᵀ}
{(1, 0)ᵀ, (1, 1)ᵀ}
{(1, 0)ᵀ, (2, 0)ᵀ}

16
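The three examples can be settled mechanically: in R2, two vectors are linearly independent, and hence a basis, exactly when the determinant of the 2×2 matrix with those columns is nonzero. A sketch (`det2` is a hypothetical helper):

```python
def det2(u, v):
    """Determinant of the 2x2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

pairs = [([1, 0], [0, 1]),   # basis
         ([1, 0], [1, 1]),   # basis
         ([1, 0], [2, 0])]   # not a basis: linearly dependent
for u, v in pairs:
    print(u, v, "basis" if det2(u, v) != 0 else "not a basis")
```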

Review: Matrix Properties

Columns of A = (A1 ... An) are linearly independent if and only if the only solution of Ax = 0 is x = 0.

Examples:

(1 1)        (0 0)
(1 0) ?      (2 1) ?

Columns of an m by n matrix A span Rm if Ax = b has a solution for every b ∈ Rm.

Examples:

(1 0 1 0)        (0 0)
(1 2 0 1) ?      (2 1) ?

Rank of matrix A is the size of the largest collection of linearly independent columns (the column rank)
- equivalently, the size of the largest collection of linearly independent rows (the row rank)
- Fact: column rank = row rank ≤ min(m, n)

17

Review: Invertible Matrices

An m by m matrix A is invertible if there is some A′ such that AA′ = A′A = Im (the identity matrix: 0s off-diagonal, 1s on-diagonal).

The following properties are equivalent for a square matrix:
- A is invertible
- columns of A span Rm
- columns of A are linearly independent
- for every b ∈ Rm, Ax = b has a unique solution

18

Basis of a Matrix

Consider an m× n matrix A

B is a basis for A if AB is invertible (the columns of AB are linearly independent and span Rm). Need |B| = m.

Example:

A = (1 0 1 0)
    (1 2 0 1)

For B = {1, 3}, obtain AB = (1 1)
                            (1 0).

AB is invertible, and so {1, 3} is a basis for A.

Extension rule: If the columns of A span, and the columns of AC are linearly independent for |C| < m, we can extend C to form a basis B for A.

19

Basic Solutions

Definition

x is a basic solution to Ax = b if the columns Ai of A with xi ≠ 0 are linearly independent.

For basis B, let B′ denote {1, ..., n} \ B. Call the variables xB basic and the variables xB′ nonbasic.
- Example: for B = {1, 3}, xB = (x1, x3), xB′ = (x2, x4).

Definition

The basic solution corresponding to basis B is obtained by setting xB′ = 0 and solving for xB.

AB xB + AB′ xB′ = AB xB + 0 = b, and so xB = AB⁻¹b. Unique solution (AB invertible). Some xB values may be 0!

Fact

x is a basic solution if and only if there is a basis B s.t. the non-basic variables xB′ = 0 (via the extension rule).

20
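The recipe xB = AB⁻¹b can be sketched in plain Python, using the matrix A and basis B = {1, 3} from the previous slide (indices are 0-based in the code, so B becomes (0, 2); `solve2` is a hypothetical Cramer's-rule helper for the 2×2 case):

```python
def solve2(M, rhs):
    """Solve the 2x2 system M x = rhs by Cramer's rule (M must be invertible)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    assert det != 0, "A_B is singular: B is not a basis"
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - rhs[0] * M[1][0]) / det]

A = [[1, 0, 1, 0],
     [1, 2, 0, 1]]
b = [2, 4]
B = [0, 2]                                    # the basis {1, 3}, 0-indexed
A_B = [[A[r][j] for j in B] for r in range(2)]
x_B = solve2(A_B, b)                          # basic variables x1 and x3
x = [0.0] * 4                                 # nonbasic variables set to 0
for pos, j in enumerate(B):
    x[j] = x_B[pos]
print(x)  # [4.0, 0.0, -2.0, 0.0]: basic, but not feasible (x3 < 0)
```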

Example: Basic Solutions

max x1 + x2

s.t. x1 ≤ 2

x1 + 2x2 ≤ 4

x1, x2 ≥ 0

Standard equality form (also canonical here):

max x1 + x2

s.t. x1 + x3 = 2

x1 + 2x2 + x4 = 4

x1, x2, x3, x4 ≥ 0

Five bases: {1, 2}, {1, 3}, {1, 4}, {2, 3}, {3, 4}.
Corresponding basic solutions: (x1, x2, x3, x4)ᵀ = (2, 1, 0, 0)ᵀ, (4, 0, −2, 0)ᵀ, (2, 0, 0, 2)ᵀ, (0, 2, 2, 0)ᵀ, (0, 0, 2, 4)ᵀ

All feasible?

21
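The enumeration on this slide can be reproduced by brute force: test every 2-subset of columns, keep those whose AB is invertible, and compute the basic solution. A sketch (indices are 0-based in the code, so e.g. basis {1, 3} appears as (0, 2); `basic_solution` is a hypothetical helper):

```python
from itertools import combinations

A = [[1, 0, 1, 0],
     [1, 2, 0, 1]]
b = [2, 4]

def basic_solution(B):
    """Basic solution for column subset B, or None if A_B is singular."""
    M = [[A[r][j] for j in B] for r in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        return None                          # B is not a basis
    x = [0.0] * 4                            # nonbasic variables are 0
    x[B[0]] = (b[0] * M[1][1] - M[0][1] * b[1]) / det
    x[B[1]] = (M[0][0] * b[1] - b[0] * M[1][0]) / det
    return x

for B in combinations(range(4), 2):          # all six 2-subsets of columns
    x = basic_solution(B)
    if x is not None:                        # five of them are bases
        tag = "feasible" if all(xi >= 0 for xi in x) else "infeasible"
        print(B, x, tag)
```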

Basic Feasible Solution (BFS)

Constraints Ax = b, and x ≥ 0.

Definition

A basic feasible solution is basic and feasible.

In the example, there are 4 BFS, each of which corresponds to a feasible solution (x1, x2)ᵀ = (2, 1)ᵀ, (2, 0)ᵀ, (0, 2)ᵀ, (0, 0)ᵀ of the original LP. The other basic solution is infeasible: not x ≥ 0.

22

BFS occur at the “corners” of the feasible region

Geometrically, optimization finds a “corner” solution

Corners correspond exactly to BFS

Optimization ↔ geometry ↔ algebra

Let’s prove the correspondence between extreme points and BFS

23

Main LP Assumptions

max cᵀx s.t. Ax = b, x ≥ 0

A has full row rank:
- wlog, because if not, there is redundancy: one row can be written as a linear combination of the other rows

Fewer rows than columns (m < n):
- wlog, because this makes it an optimization problem!
- n variables, m equations; (n − m) is the "degree of freedom"

Columns of A span Rm:
- meaning Ax = b has a solution for every b
- follows from full row rank, and thus column rank = m

24

BFS and Extreme Points

Theorem

Consider P = {x : Ax = b, x ≥ 0}, for A with columns that span. Then the extreme points of P are exactly the BFS of P.

Proof.

(⇐) A BFS is an extreme point:

Suppose x is a BFS corresponding to basis B.

For contradiction, assume x is not an extreme point, i.e. x = λy + (1 − λ)z for y, z ∈ P, y ≠ z and some λ ∈ (0, 1).

For all i ∉ B, we have xi = 0, and because y, z ≥ 0, we must have yi = zi = 0 for all i ∉ B.

Conclude that y and z are the same BFS as x. Why? They have the same basic and non-basic variables, and with the non-basic variables at 0, the system AB xB = b has a unique solution.

Contradiction!

25

BFS and Extreme Points

Proof.

(⇒) An extreme point is a BFS:

Let x be an extreme point. Assume for contradiction it is not basic. Let C = {i : xi > 0}, with |C| = k.

AC must not have linearly independent columns:
- if k ≤ m and the columns were linearly independent, then x would be basic.
- if k > m, this would imply a column rank larger than m!

Let d′ denote a non-zero vector in Rk such that AC d′ = 0. Define d ∈ Rn with di = d′i for i ∈ C, and di = 0 otherwise.

For small ε > 0, the points x ± εd are distinct (since d ≠ 0) and both in P (since AC d′ = 0 and thus Ad = 0, and x ± εd ≥ 0 for small ε since di = 0 whenever xi = 0).

A contradiction with x being an extreme point: x is the midpoint of x + εd and x − εd.

26

Summary

If there’s an extreme point, and an optimal solution, then there’s an optimal solution at an extreme point.

All non-zero variables in a basic solution x correspond to linearly independent columns of A. There is also a basis B (|B| = m, AB of rank m) s.t. the non-basic variables xB′ = 0.

Extreme points of the polyhedron are exactly the basic feasible solutions (BFS) (basic and x ≥ 0).

Suggests we can solve LPs by searching through BFS. But can we do this efficiently?
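The search idea can be sketched directly under this section's assumptions (full row rank, m < n): enumerate all C(n, m) column subsets, solve for each basic solution with exact arithmetic, keep the feasible ones, and return the best objective value. This is correct but examines combinatorially many bases, which is exactly the efficiency concern. Helper names here are hypothetical:

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Gauss-Jordan elimination on a square system; None if singular."""
    n = len(M)
    M = [[Fraction(v) for v in row] + [Fraction(rhs[i])]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # singular: not a basis
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def brute_force_lp(c, A, b):
    """max c^T x s.t. Ax = b, x >= 0, by enumerating all BFS."""
    m, n = len(A), len(A[0])
    best = None
    for B in combinations(range(n), m):
        xB = solve_square([[A[r][j] for j in B] for r in range(m)], b)
        if xB is None or any(v < 0 for v in xB):
            continue                         # not a basis, or BFS infeasible
        x = [Fraction(0)] * n
        for pos, j in enumerate(B):
            x[j] = xB[pos]
        value = sum(ci * xi for ci, xi in zip(c, x))
        if best is None or value > best[0]:
            best = (value, x)
    return best

# The lecture's example in standard equality form: max x1 + x2
value, x = brute_force_lp([1, 1, 0, 0], [[1, 0, 1, 0], [1, 2, 0, 1]], [2, 4])
print(value, x)  # optimal value 3, attained at the BFS (2, 1, 0, 0)
```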

27