Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric...

27
Algorithmic Convex Geometry August 2011

Transcript of Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric...

Page 1: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

Algorithmic Convex Geometry

August 2011

Page 2: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

2

Page 3: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

Contents

1 Overview 5

1.1 Learning by random sampling . . . . . . . . . . . . . . . . . . 5

2 The Brunn-Minkowski Inequality 7

2.1 The inequality . . . . . . . . . . . . . . . . . . . . . . . . . . 82.1.1 Proving the Brunn-Minkowski inequality . . . . . . . . 8

2.2 Grunbaum’s inequality . . . . . . . . . . . . . . . . . . . . . . 10

3 Convex Optimization 15

3.1 Optimization in Euclidean space . . . . . . . . . . . . . . . . 153.1.1 Reducing Optimization to Feasibility . . . . . . . . . . 183.1.2 Feasibility Problem . . . . . . . . . . . . . . . . . . . . 183.1.3 Membership Oracle . . . . . . . . . . . . . . . . . . . . 27

3

Page 4: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

4 CONTENTS

Page 5: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

Chapter 1

Overview

Algorithmic problems in geometry often become tractable with the assump-tion of convexity. Optimization, volume computation, geometric learningand finding the centroid are all examples of problems that are significantlyeasier for convex sets.

We will study this phenomenon in depth, pursuing three tracks thatare closely connected to each other. The first is the theory of geometricinequalities. We begin with classical topics such as the Brunn-Minkowskiinequality, and later deal with more recent developments such as isoperimet-ric theorems for convex bodies and their extensions to logconcave functions.The second track is motivated by the problem of sampling a geometric dis-tribution by a random walk. Here we will develop some general tools anduse them to analyze geometric random walks. The inequalities of the firsttrack play a key role in bounding the rate of convergence of these walks.The last track is the connection between sampling and various algorithmicproblems, most notably, that of computing the volume of a convex body (ormore generally, integrating a logconcave function). Somewhat surprisingly,random sampling will be a common and essential feature of polynomial-timealgorithms for these problems. In some cases, including the volume problem,sampling by a random walk is the only known way to get a polynomial-timealgorithm.

1.1 Learning by random sampling

We will now see our first example of reducing an algorithmic problem to arandom sampling problem. In a typical learning problem, we are presentedwith samples X1, X2, ... from the domain of the function, and have to guess

5

Page 6: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

6 CHAPTER 1. OVERVIEW

the values f(Xi). After after each guess we are told whether it was rightor wrong. The objective is to minimize the number of wrong guesses. Oneassumes there is an unknown function f belonging to some known restrictedclass of functions.

As a concrete example, suppose there is a fixed unknown vector ~a ∈ Rn,

and our function f is defined by

f(~x) =

{

True if ~a · ~x ≥ 0False if ~a · ~x < 0

Assume the right answer has components ai ∈ {−2b, ..., 2b} ⊂ Z. Considerthe following algorithm. At each iteration, choose a random ~a from thosethat have made no mistakes so far, and use that to make the next guess.

If, on every step, we pick the answer according the majority vote of those~a which have made no mistake so far, then every mistake would cut downthe field of survivors by at least a factor of 2. As there are 2b(n+1) candidatesat the outset, you would make at most (b + 1)n mistakes.

Exercise 1. Show that for the randomized algorithm above,

E(number of mistakes) ≤ 2(b + 1)n.

Page 7: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

Chapter 2

The Brunn-Minkowski

Inequality

In this lecture, we will prove a fundamental geometric inequality – theBrunn-Minkowski inequality. This inequality relates the volumes of setsin high-dimensional spaces under convex combinations. Let us first recallthe definition of convexity.

Definition 1. Let K ⊆ Rn. K is a convex set if for any two points x, y ∈ K,

and any 0 ≤ λ ≤ 1, λx + (1 − λ)y ∈ K.

To motivate the inequality, consider the following version of cutting a(convex) cake: you pick a point x on the cake, your brother makes a singleknife cut and you get the piece that contains x. A natural choice for x isthe centroid. For a convex set K, it is

x =1

Vol(K)

y∈Ky dy.

What is the minimum fraction of the cake that you are guaranteed to get?For convenience, let K be a convex body whose centroid is the origin.

Let u ∈ Rn be the normal vector defining the following halfspaces:

H1 = {v ∈ Rn : u · v ≥ 0}

H2 = {v ∈ Rn : u · v < 0}

Now we can consider the two portions that u cuts out of K:

K1 = H1 ∩ K

K2 = H2 ∩ K

7

Page 8: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

8 CHAPTER 2. THE BRUNN-MINKOWSKI INEQUALITY

We would like to compare Vol(K1) and Vol(K2) with Vol(K).Consider first one dimension – a convex body in one dimension is just a

segment on the real number line. It’s clear that any cut through the centroidof the segment (i.e. the center) divides the area of the segment into two sidesof exactly half the area of the segment.

For two dimensions, this is already a non-trivial problem. Let’s consideran isosceles triangle, whose side of unique length is perpendicular to the xaxis. If we make a cut through the centroid perpendicular to the x axis, itis readily checked that the volume of the smaller side is 4

9 ’ths of the totalvolume. Is this the least possible in R

2? What about in Rn?

The Brunn-Minkowski inequality will be very useful in answering thesequestions.

2.1 The inequality

We first define the Minkowski sum of two sets:

Definition 2. Let A, B ⊆ Rn. The Minkowski sum of A and B is the set

A + B = {x + y : x ∈ A, y ∈ B}

How is the volume of A + B related to the volume of A or B? TheBrunn-Minkowski inequality relates these quantities.

Theorem 3 (Brunn-Minkowski). Let 0 ≤ λ ≤ 1, and suppose that A, B,andλA + (1 − λ)B are measurable subsets of R

n. Then,

Vol(λA + (1 − λ)B)1/n ≥ λVol(A)1/n + (1 − λ)Vol(B)1/n.

Recall that for a measurable set A and a scaling factor λ, we have that:

Vol(λ(A)) = λnVol(A).

It follows that an equivalent statement of the inequality is the following: formeasurable sets A, B and A + B over R

n:

Vol(A + B)1/n ≥ Vol(A)1/n + Vol(B)1/n.

2.1.1 Proving the Brunn-Minkowski inequality

For some intuition, let’s first consider the Brunn-Minkowski inequality whenA and B are axis-aligned cuboids in R

n. A cuboid in Rn is a generalization

Page 9: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

2.1. THE INEQUALITY 9

of the familiar rectangle in two dimensions. An axis-aligned cuboid withside lengths (a1, a2, . . . an) is the set

A = {x ∈ Rn : li ≤ xi ≤ li + ai}

for l = (l1, . . . , ln) ∈ Rn.

Cuboids

Let A be a cuboid with side lengths (a1, a2, . . . , an) and B be a cuboid withside lengths (b1, b2, . . . , bn). Let us prove the Brunn-Minkowski inequalityfor A and B.

This proof will follow easily because if A and B are cuboids, then A+Bis a cuboid with side lengths (a1 + b1, a2 + b2, . . . an + bn). Since these arecuboids, it is easy to compute their volumes:

Vol(A) =n∏

i=1

ai, Vol(B) =n∏

i=1

bi, Vol(A + B) =n∏

i=1

ai + bi

We want to show that

Vol(A + B)1/n ≥ Vol(A)1/n + Vol(B)1/n.

To this end, consider the ratio between the volumes

Vol(A)1/n + Vol(B)1/n

Vol(A + B)1/n=

(∏n

i=1 ai)1/n + (

∏ni=1 bi)

1/n

(∏n

i=1 ai + bi)1/n

=

(

n∏

i=1

ai

ai + bi

)1/n

+

(

n∏

i=1

bi

ai + bi

)1/n

≤n∑

i=1

ai

ai + bi+

n∑

i=1

bi

ai + bi

= 1

Which proves our claim; the inequality used is the standard inequality be-tween the geometric and the arithmetic mean.

Now that we have the result for cuboids, how can we generalize thisto arbitrary measurable sets? The key is that any measurable set can beapproximated arbitrarily well with unions of cuboids. We will prove thatif A ∪ B is the union of a finite number of cuboids, the Brunn-Minkowskiinequality holds. We can prove the general result by approximating arbitrarymeasurable sets with unions of cuboids.

Page 10: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

10 CHAPTER 2. THE BRUNN-MINKOWSKI INEQUALITY

Finite unions of cuboids

We prove the Brunn-Minkowski inequality by induction on the number ofdisjoint cuboids that A∪B contain. Note that our result for A and B beingcuboids is the base case.

We first translate A so that there exists a cuboid in A fully contained inthe halfspace {x : x1 ≥ 0} and there exists a cuboid in A fully contained in{x : x1 < 0}. Now, translate B so that the following inequality holds:

Vol(A+)

Vol(A)=

Vol(B+)

Vol(B)

where:A+ = A ∩ {x : x1 ≥ 0}, A− = A \ A+

B+ = B ∩ {x : x1 ≥ 0}, B− = B \ B+

Note that we can now proceed by induction on the sets A+ +B+, A− +B−,since the number of cuboids in A+ ∪ B+ and A− ∪ B− is fewer than thenumber of cuboids in A ∪ B. This allows us to complete the proof:

Vol(A + B) ≥ Vol(A+ + B+) + Vol(A− + B−)

≥(

Vol(A+)1/n + Vol(B+)1/n)n

+(

Vol(A−)1/n + Vol(B−)1/n)n

= Vol(A+)

(

1 +

(

Vol(B)

Vol(A)

)1/n)n

+ Vol(A−)

(

1 +Vol(B)

Vol(A)

1/n)n

= Vol(A)

(

1 +

(

Vol(B)

Vol(A)

)1/n)n

The first inequality above follows from the observation that since A+, B+

are separated from A−, B− by a halfspace, A++B+ is dijoint from A−+B−.It follows that

Vol(A + B)1/n ≥ Vol(A)1/n + Vol(B)1/n

which was our desired result.

2.2 Grunbaum’s inequality

We are now ready to go back to our original question: what is the smallestvolume ratio attainable for the smaller half of a cut through the centroidof any convex body? Somewhat surprisingly, the generalization of the twodimensional case we covered in the introduction is the worst case in highdimensions.

Page 11: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

2.2. GRUNBAUM’S INEQUALITY 11

Theorem 4 (Grunbaum’s inequality). Let K be a convex body. Then anyhalf-space defined by a normal vector v that contains the centroid of K con-tains at least a 1

e fraction of the volume of K.

The overview of the proof is that we will consider the body K and ahalfspace defined by v going through the centroid. We will simplify K byperforming a symmetrization step – replacing each (n− 1)-dimensional sliceS of K perpendicular to v with an (n − 1)-dimensional ball of the samevolume as S to obtain a new convex body K ′. We will verify the convexityof K ′ by using the Brunn-Minkowski inequality. This does not modify thefraction of volume contained in the halfspace. Next, we will construct a conefrom K ′. The cone will have the property that the fraction of volume in thehalfspace can only decrease, so it will still be a lower bound on K. Thenwe will easily calculate the ratio of the volume of a halfspace defined by anormal vector v cutting through the centroid of a cone, which will give usthe desired result.

The two steps of symmetrization and conification can be seen in figure2.1.

Figure 2.1: The first diagram shows the symmetrization step – each slice ofK is replaced with an (n − 1) dimensional ball. The second diagram showshow we construct a cone from K ′.

Proof. Assume without loss of generality that v is the x1 axis, and that thecentroid of K is the origin. From the convex body K, we construct a newset K ′. For all α ∈ R, we place an (n− 1) dimensional ball centered aroundc = (α, 0, . . . 0) on the last (n − 1) coordinates with radius rα such that

Vol(B(c, rα)) = Vol(K ∩ {x : x1 = α}).

Page 12: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

12 CHAPTER 2. THE BRUNN-MINKOWSKI INEQUALITY

See figure 2.1 for a visual of the construction. Recall that the volume of an− 1 dimensional ball of radius r is f(n− 1)rn−1 (where f is a function notdepending on r and only on n), so that we set

rα =

(

Vol(K ∩ {x : x1 = α})f(n − 1)

)1/(n−1)

We first argue that K ′ is convex. Consider the function r : R → R, definedas r(α) = rα. If r is a concave function (i.e. ∀x, y ∈ R, 0 ≤ λ ≤ 1, r(λx +(1 − λ)y) ≥ λr(x) + (1 − λ)r(y)), then K ′ is convex.

Note that rα is proportional to Vol(K ∩ {x : x1 = α})1/(n−1). LetA = K ∩ {x : x1 = a}, and let B = K ∩ {x : x1 = b}. For some 0 ≤ λ ≤ 1,let C = K ∩ {x : x1 = λa + (1 − λ)b}. r is concave if

Vol(C)1/n−1 ≥ λVol(A)1/n−1 + (1 − λ)Vol(B)1/n−1 (2.1)

It is easily checked that

C = λA + (1 − λ)B

so that (2.1) holds by the Brunn-Minkowski inequality, and K ′ is convex.Let

K ′+ = K ′ ∩ {x : x1 ≥ 0}

K ′− = K ′ ∩ {x : x1 < 0}.

Then the ratio of volumes on either side of x1 = 0 has not changed, i.e.

Vol(K ′+) = Vol(K ∩ {x : x1 ≥ 0})

Vol(K ′−) = Vol(K ∩ {x : x1 < 0}).

We argue thatVol(K ′

+)

Vol(K ′)≥ 1

e

and the other side will follow by symmetry. To this end, we transform K ′

into a cone (see figure 2.1). To be precise, we replace K ′+ with a cone C of

equal volume with the same base as K ′+. We replace K ′

− with the extensionE of the cone C, such that Vol(E) = Vol(K ′

−). The centroid of K ′ was atthe origin; the centroid of C ∪ E can only be pushed to the right along x1,because r is concave. Let the position of the centroid of C ∪ E along x1 beα ≥ 0. Then we have that

Vol(K ′+)

Vol(K ′)≥ Vol((C ∪ E) ∩ {x : x1 ≥ α})

Vol(C ∪ E)

Page 13: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

2.2. GRUNBAUM’S INEQUALITY 13

so a lower bound on the latter ratio implies a lower bound onVol(K′

+)

Vol(K′) . Tothis end, we compute the ratio of volume on the right side of a halfspacecutting through the centroid of a cone.

Let the height of the cone C ∪E be h, and the length of the base be R.The centroid of the cone is at position α along x1 given by:

1

Vol(C ∪ E)

∫ h

t=0tf(n − 1)

(

tR

h

)n−1

dt =f(n − 1)

Vol(C ∪ E)

(

R

h

)n−1 ∫ h

t=0tndt

=f(n − 1)

Vol(C ∪ E)· Rn−1h2

n + 1

where f(·) is the function independent of r such that f(n)rn gives the volumeof an n dimensional ball with radius r. By noting that, for a cone,

Vol(C ∪ E) =f(n − 1)Rn−1h

n

we have that α = nn+1h. Now, plugging in to compute the volume of C (the

right half of the cone), we have our desired result.

Vol(C) =

∫ n

n+1h

t=0f(n − 1)

(

tR

h

)n−1

dt

=Rn−1f(n − 1)

hn−1

∫ n

n+1h

t=0tn−1dt

=

(

n

n + 1

)n

Vol(C ∪ E)

≥ 1

eVol(C ∪ E)

Page 14: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

14 CHAPTER 2. THE BRUNN-MINKOWSKI INEQUALITY

Page 15: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

Chapter 3

Convex Optimization

3.1 Optimization in Euclidean space

Let S ⊂ Rn, and let f : S → R be a real-valued function. The problem of

interest may be stated as

min f(x)x ∈ S,

(3.1)

that is, find the point x ∈ S which minimizes by f . For convenience, we de-note by x∗ a solution for the problem (3.1) and by f∗ = f(x∗) the associatedcost.

Restricting the set S and the function f to be convex1, we obtain a classof problems which are polynomial-time solvable in

n and log

(

1

ǫ

)

,

where ǫ defines an optimality criterion such as

‖x − x∗‖ ≤ ǫ. (3.2)

Linear programming (LP) is one important subclass of this family ofproblems. Let A : R

n → Rm be a linear transformation, b ∈ R

m and c ∈ Rn.

We state a LP problem as

minx

cT x

Ax ≥ b.(3.3)

1In fact, one could easily extend to the case of quasi-convex f .

15

Page 16: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

16 CHAPTER 3. CONVEX OPTIMIZATION

Here, the feasible set is a polyhedron, for many CO problems it is alsobounded, i.e., it is a polytope.

Here we illustrate other classical examples.

Example 5. Minimum weight perfect matching problem; given graph G =(V, E), with costs ω : E → R, find

min∑

e∈E cexe∑

e∈δ(v) xe = 1, for all v ∈ V

xe ∈ {0, 1}, for all e ∈ E,

(3.4)

where δ(v) = {e = (i, j) ∈ E : v = i or v = j}. Due to the last set ofconstraints, the so called integrality constraints, (3.4) is not a LP.

The previous problem had integrality constraints which usually increasethe difficulty of the problem. One possible approach is to solve a linearrelaxation to get (at least) bounds for the original problem. In our particularcase,

min∑

e∈E cexe∑

e∈δ(v) xe = 1, for all v ∈ V

0 ≤ xe ≤ 1, for all e ∈ E.

(3.5)

This problem is clearly an LP, and thus, it can be efficiently solved. Un-fortunately, the solution for the LP might not be interesting for the originalproblem (3.4). For example, it is easy to see that the problem (3.4) is infea-sible when |V | = 3, while the linear relaxation is not2.

Surprisingly, it is known that the graph G is bipartite if and only if eachvertex of the polytope defined in (3.5) is a feasible solution for (3.4). Thus,for this particular case, the optimal cost coincide for both problems.

For general graphs, we need to add another family of constraints in orderto obtain the previous equivalence3.

e∈(S,Sc)

xe ≥ 1, for all S ⊂ V, |S| odd. (3.6)

Note that there are an exponential number of constraints in this family.Thus, it is not obvious that it is polynomial-time solvable.

Example 6. Another convex optimization problem when a polyhedra is in-tersected with a ball of radius R, i.e., the feasible set is nonlinear.

2Just let x = (0.5, 0.5, 0.5).3The proof is due to Jack Edmonds.

Page 17: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

3.1. OPTIMIZATION IN EUCLIDEAN SPACE 17

min cT xAx ≥ b∑n

i=1 x2i ≤ R2.

(3.7)

Example 7. Semi-definite programming (SDP). Here the variable, insteadof being a vector in R

n, is constrained to being a semi-definite positive ma-trix.

min C · XA · X ≤ bi, for i = 1, . . . , mX < 0,

(3.8)

where A · B =∑n

i=1

∑nj=1 AijBij denotes a generalized inner product, and

the last constraints represent

∀y ∈ Rn, yT Xy ≥ 0

which defines an infinite number of linear constraints.

One might notice that it is impossible to explicitly store all constraintsin (3.6) or (7). It turns out that there is an extremely convenient way todescribe the input of a convex programming problem in an implicit way.

Definition 8. A separation oracle for a convex set K ⊂ Rn is an algorithm

that for each x ∈ Rn, either states that x ∈ K, or provides a vector a ∈ R

n

such that

aT x < b and aT y ≥ b for all y ∈ K

Exercise Prove that for any convex set, there exists a separation oracle.

Returning to our examples,

5. All that is needed is to check each constraint and return a violatedconstraint if it is the case. In the complete formulation, finding aviolating inequality can be done in polynomial time but is nontrivial.

6. A hyperplane tangent to the ball of radius R can be easily constructed.

7. Given X̄, the separation oracle reduces to finding a vector v such thatX̄v = −λv for some positive scalar λ. This would imply that

vT X̄v = λ ⇒ X̄ · (vvT ) = −λ,

while X · (vvT ) ≥ 0 for all feasible points.

Page 18: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

18 CHAPTER 3. CONVEX OPTIMIZATION

3.1.1 Reducing Optimization to Feasibility

Before proceeding, we will reduce the problem (3.1), under the convexityassumptions previous mentioned, to a feasibility problem. That is, given aconvex set K, find a point on K or prove that K is empty (“give a certificate”that K is empty).

For a scalar parameter t, we define the convex set

K ∩ {x ∈ Rn : f(x) ≤ t}

If we can find a point in such a convex set, then we can solve (3.1) via binarysearch for the optimal t.

3.1.2 Feasibility Problem

Now, we are going to investigate a convex feasibility algorithm with thefollowing rules:

Input: separate oracle for K, and r, R ∈ R++

(where a ball of radius r contained in K, K is contained in a ball of radius R)Output: A point in K if K is not empty, or give a certificate that K is empty.

Theorem 9. Any algorithm needs n log2Rr oracle calls in the worst case.

Proof. Since we only have access to the oracle, the set K may be in thelargest remaining set at each iteration. In this case, we can reduce the totalvolume at most by 1/2. It may happen for iteration k as long,

(

1

2

)k

<( r

R

)n

which implies that

k < n log2

R

r

In the seminal work of Khachiyan and its extension by GLS, the EllipsoidAlgorithm was proved to require O(n2 log R

r ) iterations for the convex feasi-bility problem. Subsequently, Karmarkar introduced interior points methodsfor the special case of linear programming; the latter turned out to be morepractical (and polynomial-time).

We are going to consider the following algorithm for convex feasibility.

Page 19: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

3.1. OPTIMIZATION IN EUCLIDEAN SPACE 19

1. P = cube(R), z = 0.2. Call Separation Oracle for z. If z ∈ K stop.3. P := P ∩ {x ∈ R

n : aT x ≤ aT z}4. Define the new point z ∈ P .5. Repeat steps 2, 3, and 4 N times6. Declare that K is empty.

Two of the steps are not completely defined. It is intuitive that the choiceof the points z will define the number of iterations needed, N . Many choicesare possible here, e.g., the centroid, a random point, the average of randompoints, analytic center, point at maximum distance from the boundary, etc.

For instance, consider the centroid and an arbitrary hyperplane crossingit. As we proved in the previous lecture, P will be divided into two convexbodies, each having at least 1

e of the total volume of the starting set (Grun-baum’s theorem). Since the initial volume is Rn and the final volume is atleast rn, we need at most

log1− 1

e

(

R

r

)n

= O(

n logR

r

)

iterations.So, we have reduced convex optimization to convex feasibility, and the

latter we reduced to finding centroid. Unfortunately, the problem of findingthe centroid is #P-hard.

Next, let us consider z to be defined as a random point from P . Forsimplicity, consider the case where P is a ball of radius R centered at theorigin. We would like z to be close to the center, so it is of interest toestimate its expected norm.

E[‖x‖] =

∫ R

0

Vol(Sn)tn−1

Vol(Bn)Rntdt (3.9)

=n

Rn

∫ R

0tndt =

n

n + 1R. (3.10)

That is, a random point in the ball is close to the boundary. Now, weproceed to estimate how much the volume is decreasing at each iteration.

Defining b = E[‖x‖] = nn+1R, we obtain a ≤ R

2n+1 . This implies that

the volume is falling by(

2

n + 1

)n

,

Page 20: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

20 CHAPTER 3. CONVEX OPTIMIZATION

Figure 3.1: Sphere cap.

which is decreasing exponentially in n. Thus, it will not give us a polynomial-time algorithm.

Now, consider the average of random points,

z =1

m

m∑

i=1

yi,

where yi’s are i.i.d. points in P .

We will prove shortly the following theorem.

Theorem 10. Let z and {yi}mi=1 be defined as above, we have

E[Vol(P ′)] ≤(

1 − 1

e+

n

m

)

Vol(P ).

Thus, the following corollary will hold.

Page 21: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

3.1. OPTIMIZATION IN EUCLIDEAN SPACE 21

Corollary 11. With m = 10n, the algorithm finishes in O(n log Rr ) itera-

tions with high probability.

We will begin to prove Theorem (10) for balls centered at the origin.This implies that E[z] = 0 and, using the radial symmetry,

var(‖z‖) ≤ E[‖z‖2] =1

mE[|yi|2]

=1

m

∫ R

0

Vol(Sn)tn−1

Vol(Bn)Rnt2dt

=n

mRn

∫ R

0tn+1dt =

n

m(n + 2)R2.

To conclude the proof for this special case, using the inequality

P

(

‖z‖ >R√n

)

≤ V ar(z)n

R2=

nR2

m(n + 2)

n

R2≤ n

3m, (3.11)

using m = O(n), with a fixed probability, we obtain ‖z‖ ≤ R√n. Now b = R√

n,

thus,

a2 =(

R + R√n

)(

R − R√n

)

= R2(

1 − 1n

)

. (3.12)

This implies that the volume is falling by

(

1 − 1

n

)n/2

≥ 1√e.

To extend the previous result to an ellipsoid E, it is enough to observethat if A : R

n → Rn is an affine linear transformation, E = AB, which

scales the volume of every subset by the same factor, namely, |det(A)|.To generalize for arbitrary convex sets, we introduce the following con-

cept.

Definition 12. A set S is in isotropic position if for a random point x ∈ S,

1. E[x] = 0;

2. ∀v ∈ Rn, E[(vT x)2] = ‖v‖2.

Next, we are going to prove a lemma which proves an equivalence be-tween the second condition of isotropic position and conditions over thevariance-covariance matrix of the random vector x.

Page 22: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

22 CHAPTER 3. CONVEX OPTIMIZATION

Lemma 13. Let x be a random point in a convex set S, we have ∀v ∈ Rn,

E[(vT x)2] = ‖v‖2 if and only if E[xxT ] = I.

Proof. If E[(vT x)2] = ‖v‖2, for v = ei we have

E[(eTi x)2] = E[x2

i ] = 1.

For v := (v1, v2, 0, . . . , 0),

E[(vT x)2] = E[v21x

21 + v2

2x22 + 2v1v2x1x2]

= v21 + v2

2 + 2v1v2E[x1x2]

= v21 + v2

2,

where the last equality follows from the second condition of Definition 12.So, E[x1x2] = 0.

To prove the other direction,

E[(vT x)2] = E[(vT xxT v)] = vTE[xxT ]v = vT v = ‖v‖2.

It is trivial to see that the first condition of isotropic position can beachieved by a translation. It turns out that is possible to obtain the secondcondition by means of a linear transformation.

Lemma 14. Any convex set can be put in isotropic position by an affinetransformation.

Proof. EK [xxT ] = A is positive definite. Then, we can write A = B2 (orBBT ).

Defining, y = B−1x, we get

E[yyT ] = B−1E[xxT ](B−1)T = I.

The new set is defined as

K ′ := B−1(K − z(K)),

where z(K) := E[x], the centroid of K.

Note that the argument used before to relate the proof for balls to ellip-soids states that volume ratios are invariant under affine transformations.Thus, we can restrict ourselves to the case where the convex set is in isotropicposition due to the previous lemma.

Page 23: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

3.1. OPTIMIZATION IN EUCLIDEAN SPACE 23

Lemma 15. K is a convex body, z is the average of m random points fromK. If H is a half space containing z,

E[Vol(H ∩ K)] ≥(

1

e−√

n

m

)

Vol(K).

Proof. As state before, we can restrict ourselves to the case which K is inisotropic position. Since z = 1

m

∑mi=1 yi,

E[‖z‖2] =1

m2

m∑

i=1

E[‖yi‖2] =1

mE[‖yi‖2]

=1

m

n∑

j=1

E[(yij)

2] =n

m,

where the first equality follows from the independence between yi’s, andequalities of the second line follows from the isotropic position.

Let h be a unit vector normal to H, since the isotropic position is in-variant under rotations, we can assume that h = e1 = (1, 0, . . . , 0).

Define the (n−1)-volume of the slice K∩{x ∈ Rn : x1 = y} as Voln−1(K∩

{x ∈ Rn : x1 = y}). We can also define the following density function,

fK(y) =Voln−1(K ∩ {x ∈ R

n : x1 = y})Vol(K)

. (3.13)

Note that this function has the following properties:

1. fK(y) ≥ 0 for all y ∈ R;

2.

R

fK(y)dy = 1;

3.

R

yfK(y)dy = 0;

4.

R

y2fK(y)dy = 1;

The following technical lemma will give an useful bound to this density(we defer its proof).

Lemma 16. If fK satisfies the previous properties for some convex body K,

maxy

f(y) ≤ n

n + 1

n

n + 2< 1

Page 24: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

24 CHAPTER 3. CONVEX OPTIMIZATION

Combining this lemma with the bound on the norm of z,

Vol(K ∩ {xRn : x1 ≥ z1}

Vol(K)=

∫ ∞

z1

fK(y)dy

=

∫ ∞

0fK(y)dy −

∫ z1

0fK(y)dy

≥ 1

e−∫ z1

0max

yfK(y)dy

≥ 1

e− ‖z‖

≥ 1

e−√

n

m.

(3.14)

Proof. (of Lemma 16) Let K be a convex body in isotropic position forwhich maxy fK(y) is as large as possible. In Figure 3.1.2, we denote byy∗ ∈ arg maxy fK(y), the centroid is zero, and the set K is the black line.

In Figure 3.1.2, the transformation in part I consists of replacing it witha cone whose base is the cross-section at zero and its apex is on the x1 axis.The distance of the apex is defined to preserve the volume of part I. Such atransformation can only move mass to the left of the centroid.

The transformation in part II of figure 3.1.2 is to construct the convexhull between the cross sections at zero and y∗. This procedure may losesome mass (never increases). Part III is also replaced by a cone, with basebeing the cross-section at y∗ and the distance of the apex is such that wemaintain the total volume. Thus, the transformations in parts II and IIImove mass only to the right, i.e., away from the centroid.

Defining the moment of inertia4, as I(K) =∫

K(y − EK [y])2fK(y)dy, wemake the following claim.

Claim 17. Moving mass away from the centroid of a set can only increasethe moment of inertia I.

4Note that EK [y] = 0 since K is in isotropic position.

Page 25: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

3.1. OPTIMIZATION IN EUCLIDEAN SPACE 25

Figure 3.2: Construction to increase the moment of inertia I.

Let us define a transformation C → C ′, x1 → x1+g(x1), where x1g(x1) ≥ 0.

I(C) = V arC(y) = EC [y2]

I(C ′) = V arC′(y) = EC′ [y2] − (EC′ [y])2

= EC [(y + g(y))2] − (EC [y + g(y)])2

= V arC(y) + V arC(g(y)) + 2EC [yg(y)]

≥ I(C).

In fact, we have strict inequality if g(y) is nonzero on a set of positivemeasure. In this case, we can shrink the support of fK by a factor strictlysmaller that one, and scale up fK by a factor greater than one to get a setwith moment of inertia equal to 1. In this process max fK has increased.This is a contradiction since we started with maximum possible. Thus, Kmust have the shape of K ′.

Now, we proceed for the transformation in Figure 3.1.2.

Page 26: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

26 CHAPTER 3. CONVEX OPTIMIZATION

Figure 3.3: Construction to reduce to a double cone.

In Figure 3.1.2, y is chosen to preserve the volume in Part I, whilePart II is not altered. Note that all the mass removed from the segmentbetween [−y, y] is replaced at a distance to zero greater than y. Thus, themoment of inertia increases. By the same argument used before, we obtaina contradiction. Hence, K must be a double cone.

Once we obtain a double cone, we can move y∗ to the last point to theright, and build a cone with the same volume which has a greater momentof inertia. So K must be this cone with length h and with base area A at

y∗. Then, it is an easy exercise in integration to get h = (n+1)√

n+2n . Since

Vol(K) = Ahn ,

f(y∗) =A

Vol(K)=

n

h=

n

n + 1

n

n + 2< 1.

Page 27: Algorithmic Convex Geometryvempala/hda/notes.pdf · The first is the theory of geometric inequalities. We begin with classical topics such as the Brunn-Minkowski inequality, and

3.1. OPTIMIZATION IN EUCLIDEAN SPACE 27

3.1.3 Membership Oracle

We have reduced quasi-convex optimization over convex sets to a feasibil-ity problem and solved the latter using a polynomial number of calls to aseparation oracle.

We note that with only a membership oracle, given an arbitrary pointx in the space, it will return a certificate that x ∈ K or a certificate thatx /∈ K, and a point x ∈ K such that x+rB ⊆ K ⊆ x+RB, the optimizationproblem can still be solved using a polynomial number of oracle calls. Theidea is to use the oracle to generate random samples (this part needs only amembership oracle), then restrict the set using the objective function valuesat the sample points. The first polynomial algorithm in this setting wasgiven by GLS using a variant of the Ellipsoid algorithm.