The concept of duality in Asymptotic Geometric Analysis: The Legendre Transform.

Shiri Artstein-Avidan and Vitali Milman

Tel Aviv University.

Adrien-Marie Legendre, 1752-1833

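The Legendre transform:

Fix a scalar product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^n$.

For a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\pm\infty\}$ let

$(\mathcal{L}f)(x) = \sup_y \{ \langle x, y \rangle - f(y) \}$.

This is an involution on the class of convex lower semi-continuous functions on $\mathbb{R}^n$.

Define: $\mathrm{Cvx}(\mathbb{R}^n)$ is the set of lower semi-continuous convex functions on $\mathbb{R}^n$ with values in $\mathbb{R} \cup \{\pm\infty\}$ (possibly infinite).

Example: For $f(x) = \|x\|_K^2 / 2$, we have $(\mathcal{L}f)(x) = \|x\|_{K^\circ}^2 / 2$.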
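The example can be checked numerically in the simplest case: dimension one with the Euclidean norm, where $f(x) = x^2/2$ should be its own transform. The following is a minimal sketch, not part of the talk; the helper `legendre`, the grid `GRID`, and the test points are all hypothetical choices, and the supremum is approximated by a maximum over finitely many sample points.

```python
def legendre(f, ys):
    """Return the function x -> max over the sample points ys of x*y - f(y)."""
    return lambda x: max(x * y - f(y) for y in ys)

# Hypothetical sample grid on [-10, 10] with step 0.01.
GRID = [i / 100.0 for i in range(-1000, 1001)]

def half_square(x):
    return 0.5 * x * x

conjugate = legendre(half_square, GRID)

# Largest deviation from x^2 / 2 at a few test points well inside the grid.
test_points = [-3.0, -1.25, 0.0, 0.5, 2.0]
max_error = max(abs(conjugate(x) - half_square(x)) for x in test_points)
print(max_error)
```

The maximum over the grid is attained at $y = x$, so the deviation is at the level of rounding error as long as the test points stay well inside the grid.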

Simplest examples: functions that are infinite everywhere except at one point; linear functions.

Abstract duality concept:

1. For all $f$, $T(T(f)) = f$.

2. For all $f \le g$ one has $T(f) \ge T(g)$.

We illustrate in this talk how the concrete formula for the Legendre transform can be obtained directly from “abstract duality”.

(Notice that condition 1 implies that T is 1-1 and onto)

Theorem 1: Any $T : \mathrm{Cvx}(\mathbb{R}^n) \to \mathrm{Cvx}(\mathbb{R}^n)$ satisfying that

(1) For all $f$, $T(T(f)) = f$

(2) For all $f \le g$ one has $T(f) \ge T(g)$

must satisfy, for some symmetric $B \in GL_n$, $v_0 \in \mathbb{R}^n$, and $C_0 \in \mathbb{R}$, that

$(Tf)(x) = C_0 + \langle v_0, x \rangle + (\mathcal{L}f)(Bx + v_0)$.

Remark: $\langle v_0, x \rangle$ versus $(\mathcal{L}f)(x + v_0)$…

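Notice that:

$(\mathcal{L}(f + \langle \cdot, v \rangle))(x) = \sup_y \left( \langle x, y \rangle - f(y) - \langle y, v \rangle \right) = \sup_y \left( \langle x - v, y \rangle - f(y) \right) = (\mathcal{L}f)(x - v)$.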
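The identity above, adding a linear term to $f$ shifts the argument of $\mathcal{L}f$, can be illustrated numerically in one dimension. This is a sketch under stated assumptions: the function `f`, the vector `v`, the grid, and the helper `legendre` are hypothetical choices, and the supremum is taken over a finite grid only.

```python
def legendre(f, ys):
    """Return the function x -> max over the sample points ys of x*y - f(y)."""
    return lambda x: max(x * y - f(y) for y in ys)

# Hypothetical sample grid on [-5, 5] with step 0.01.
GRID = [i / 100.0 for i in range(-500, 501)]
v = 0.75

def f(y):
    return y ** 4 + abs(y)

def f_tilted(y):
    # f plus the linear function y -> <y, v>
    return f(y) + v * y

lhs = legendre(f_tilted, GRID)   # L(f + <., v>)
rhs = legendre(f, GRID)          # Lf, to be evaluated at x - v

gap = max(abs(lhs(x) - rhs(x - v)) for x in [-2.0, -0.5, 0.0, 1.0, 3.0])
print(gap)
```

Both sides maximize the same expression over the same grid, so they agree up to floating-point rounding.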

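Theorem 1’: Any 1-1 and onto $T : \mathrm{Cvx}(\mathbb{R}^n) \to \mathrm{Cvx}(\mathbb{R}^n)$ satisfying that

(1) For all $f \le g$ one has $T(f) \ge T(g)$

(2) For all $T(f) \le T(g)$ one has $f \ge g$

must satisfy, for some fixed $B \in GL_n$, $v_0$ and $v_1$ in $\mathbb{R}^n$, $C_1$ positive and $C_0 \in \mathbb{R}$, that

$(Tf)(x) = C_0 + \langle v_0, x \rangle + C_1 (\mathcal{L}f)(Bx + v_1)$.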

Remark: The requirement of onto is important:

Consider the transformation $T$ given by

$T(f) = \mathcal{L}(f + |x|^2)$.

It is order reversing (not an involution, of course) and 1-1, but not onto.

Remark: compare with Böröczky-Schneider.

The sketch of the proof we will see consists of three parts:

(1) “Min” and “Max” are interchanged.

(2) It is enough to know what happens to delta functions, or linear functions.

(3) Because of order reversal, and the special properties of these “extreme functions”, we can determine their behavior.

First step:

If $T$ is 1-1 and onto and satisfies

1. $f \le g$ implies $T(f) \ge T(g)$

2. $T(T(f)) = f$

then $T(\min(f,g)) = \max(T(f), T(g))$ and $T(\max(f,g)) = \min(T(f), T(g))$, where “min” should be understood as the regularized minimum.

The proof is quite simple:

Since $\min(f,g) \le f, g$ we have $T(\min(f,g)) \ge T(f), T(g)$, so that $T(\min(f,g)) \ge \max(T(f), T(g))$.

Secondly, $\max(T(f), T(g)) = T(h)$ for some $h$, in fact, for $h = T(\max(T(f), T(g)))$. Thus $T(h) \le T(\min(f,g))$, so $h \ge \min(f,g)$.

But then again, $T(h) \ge T(f)$ and $T(h) \ge T(g)$ give $h \le f$ and $h \le g$, so $h \le \min(f,g)$ and we get equality.
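For the Legendre transform itself, the interchange of “min” and “max” can be observed numerically. The sketch below is an illustration only: the functions `f` and `g`, the grid, and the helper `legendre` are hypothetical choices. It also shows why the minimum has to be regularized: the pointwise minimum of two convex functions need not be convex.

```python
def legendre(f, ys):
    """Return the function x -> max over the sample points ys of x*y - f(y)."""
    return lambda x: max(x * y - f(y) for y in ys)

# Hypothetical sample grid on [-6, 6] with step 0.01.
GRID = [i / 100.0 for i in range(-600, 601)]

def f(y):
    return (y - 1.0) ** 2

def g(y):
    return (y + 1.0) ** 2

def pointwise_min(y):
    return min(f(y), g(y))

Lf = legendre(f, GRID)
Lg = legendre(g, GRID)
Lmin = legendre(pointwise_min, GRID)

# L(min(f, g)) should coincide with max(Lf, Lg).
xs = [-4.0, -1.0, 0.0, 0.3, 2.5]
gap = max(abs(Lmin(x) - max(Lf(x), Lg(x))) for x in xs)

# The pointwise minimum is not convex: its value at 0 exceeds the average
# of its values at -1 and 1.
convexity_defect = pointwise_min(0.0) - 0.5 * (pointwise_min(-1.0) + pointwise_min(1.0))
print(gap, convexity_defect)
```

The transform of the pointwise minimum equals the transform of its regularization, which is why the identity can be tested without computing the convex envelope explicitly.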

In fact, we may show a more general fact:

If $T$ is 1-1 and onto and satisfies

1. $f \le g$ implies $T(f) \ge T(g)$

2. $T(f) \le T(g)$ implies $f \ge g$

then $T(\min(f,g)) = \max(T(f), T(g))$ and $T(\max(f,g)) = \min(T(f), T(g))$, where “min” should be understood as the regularized minimum.

Notation: the delta functions $D_x$:

$D_x(x) = 0$, and $D_x = +\infty$ elsewhere.

Claim: If we know what happens to $D_x + c$ for all $x \in \mathbb{R}^n$ and $c \in \mathbb{R}$, we know the form of the transform.

Proof: $f(x) = \inf_y \left( D_y(x) + f(y) \right)$

and so $(Tf)(x) = \sup_y T(D_y + f(y))(x)$.
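For $T = \mathcal{L}$ the claim can be made concrete: the transform of $D_y + c$ is the affine function $x \mapsto \langle x, y \rangle - c$, and $\mathcal{L}f$ is the supremum of these pieces with $c = f(y)$. The sketch below, in one dimension, is only an illustration; the names `legendre_of_delta`, `legendre`, `assembled`, the function `f`, and the grid are hypothetical.

```python
def legendre_of_delta(y, c):
    """Legendre transform of D_y + c: the affine function x -> x*y - c."""
    return lambda x: x * y - c

def legendre(f, ys):
    """Return the function x -> max over the sample points ys of x*y - f(y)."""
    return lambda x: max(x * y - f(y) for y in ys)

# Hypothetical sample grid on [-5, 5] with step 0.1.
GRID = [i / 10.0 for i in range(-50, 51)]

def f(y):
    return abs(y) + 0.5 * y * y

# Transform f directly ...
direct = legendre(f, GRID)

# ... and by transforming each delta function D_y + f(y) and taking the sup.
pieces = [legendre_of_delta(y, f(y)) for y in GRID]

def assembled(x):
    return max(piece(x) for piece in pieces)

gap = max(abs(direct(x) - assembled(x)) for x in [-2.0, 0.0, 0.4, 3.0])
print(gap)
```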

Notation: the linear functions $h_u$:

$h_u(y) = \langle u, y \rangle$.

Claim: If we know what happens to $h_u + c$ for all $u \in \mathbb{R}^n$ and $c \in \mathbb{R}$, we know the form of the transform.

Proof: $f(x) = \sup_u \left( h_u(x) + c(u) \right)$

and so $(Tf)(x) = \inf_u T(h_u + c(u))(x)$

(here inf is the regularized infimum).

To find the image of the delta functions (and/or the linear functions), let us notice a few facts about these “elementary functions”.

Delta functions $D_x$:

…if $f \ge D_x$, then $f = D_x + c$ for some $c \ge 0$…

…the same is true for the functions $D_x + c'$…

Linear functions $h_u = \langle \cdot, u \rangle$:

…if $f \le h_u$, then $f = h_u - c$ for some $c \ge 0$…

…the same is true for the functions $h_u + c'$…

Let $T(h_u) = w$ (for $h_u = \langle \cdot, u \rangle$). If $f \ge w$ and $g \ge w$, then $T(f) \le T(w) = h_u$ and $T(g) \le T(w) = h_u$. This means $T(f)$ and $T(g)$ are both linear up to a constant: $T(f) = h_u - c$ and $T(g) = h_u - c'$.

In particular: either $T(f) \ge T(g)$ or $T(g) \ge T(f)$. Therefore: either $f \le g$ or $g \le f$.

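We see that (letting $w = T(h_u)$): any two functions $f \ge w$ and $g \ge w$ are comparable.

(Notice: this is true for $w = D_x$.)

Second: If the “support” of $w$ includes two points $x$ and $y$, we may build two non-comparable functions $f \ge w$, $g \ge w$, for example $f = D_x + w(x)$ and $g = D_y + w(y)$.

[Figure: the points $x$ and $y$ with the values $w(x)$ and $w(y)$.]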

Conclude: The “support” of $w$ is just one point, and so for every vector $u$ we have that

$T(h_u) = D_x + c$

for some vector $x$ and some constant $c$.

(In fact, for every vector $u$ and constant $c$, $T(h_u + c) = D_x + c'$ for some vector $x$ and some constant $c'$.)

We may similarly show that for every $x$ and $c$ there are $u = u(x)$ and $c' = c'(x,c)$ with $T(D_x + c) = h_u + c'$.

Next we establish the linearity of this relation, namely we show:

There exists some symmetric $B \in GL_n$, a vector $v_0 \in \mathbb{R}^n$ and a constant $c_0$ such that for every vector $x$ and constant $c$ we have that

$T(D_x + c) = h_{Bx + v_0} + \langle x, v_0 \rangle - c + c_0$,

that is, $u(x) = Bx + v_0$ and $c'(x,c) = \langle x, v_0 \rangle - c + c_0$.

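This would complete the proof, since we know

$(Tf)(x) = \sup_y T(D_y + f(y))(x)$ and $T(D_x + c) = h_{Bx + v_0} + \langle x, v_0 \rangle - c + c_0$.

And thus, using the symmetry of $B$ in the third equality:

$(Tf)(x) = \sup_y \left( \langle y, v_0 \rangle + h_{By + v_0}(x) - f(y) + c_0 \right)$

$= c_0 + \sup_y \left( \langle y, v_0 \rangle + \langle x, By + v_0 \rangle - f(y) \right)$

$= c_0 + \langle x, v_0 \rangle + \sup_y \left( \langle v_0 + Bx, y \rangle - f(y) \right)$

$= c_0 + \langle x, v_0 \rangle + (\mathcal{L}f)(v_0 + Bx)$.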
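As a sanity check on the resulting formula, one can verify in dimension one that $f \mapsto c_0 + \langle v_0, x \rangle + (\mathcal{L}f)(Bx + v_0)$ is an involution on quadratics, where everything is available in closed form. The sketch below is an illustration under assumptions not in the talk: a quadratic $a x^2/2 + b x + c$ with $a > 0$ is stored as the hypothetical coefficient triple `(a, b, c)`, and the parameter values are arbitrary.

```python
import math

def legendre_quadratic(q):
    """Legendre transform of f(x) = a*x^2/2 + b*x + c (a > 0), as coefficients."""
    a, b, c = q
    # (Lf)(y) = (y - b)^2 / (2a) - c, expanded in the same coefficient format.
    return (1.0 / a, -b / a, b * b / (2.0 * a) - c)

def general_transform(q, B, v0, c0):
    """Coefficients of x -> c0 + v0*x + (Lf)(B*x + v0) for the quadratic f given by q."""
    a, b, c = legendre_quadratic(q)
    # Substitute y = B*x + v0 into a*y^2/2 + b*y + c, then add c0 + v0*x.
    return (a * B * B,
            a * B * v0 + b * B + v0,
            a * v0 * v0 / 2.0 + b * v0 + c + c0)

# Hypothetical parameters: any nonzero B, any v0 and c0.
B, v0, c0 = 3.0, 0.7, -1.2
q = (2.0, -1.0, 0.5)

once = general_transform(q, B, v0, c0)
twice = general_transform(once, B, v0, c0)

is_involution = all(math.isclose(u, w, abs_tol=1e-9) for u, w in zip(q, twice))
print(once, twice, is_involution)
```

Applying the transform twice returns the original coefficients, in line with condition (1) of Theorem 1; in dimension one every nonzero $B$ is automatically symmetric.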

The linearity of the correspondence

$T(D_x + c) = h_{Bx + v_0} + \langle x, v_0 \rangle - c + c_0$

is established as follows:

Fact: if $F : \mathbb{R}^m \to \mathbb{R}^m$ (with $m \ge 2$), 1-1 and onto, satisfies that for every interval $[x,y]$ the image $F([x,y])$ is also an interval, then $F$ is affine linear, namely

$F(x) = Ax + v$ for some fixed $A \in GL_m$ and $v \in \mathbb{R}^m$.

Define the mapping $F : \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$ by

$F((x,c)) = (u,c')$, where $T(D_x + c) = h_u + c'$.

Consider the interval $[(x,c_1),(y,c_2)]$.

Any $(z,c)$ inside this interval satisfies that

$D_z + c \ge$ “min” $(D_x + c_1, D_y + c_2)$,

and the same is not true for $D_z + c'$ for any $c' < c$.

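Letting $F(x,c_1) = (u,c')$ and $F(y,c_2) = (v,c'')$:

$h_w + c''' = T(D_z + c) \le \max(h_u + c', h_v + c'')$,

and the same is not true for $T(D_z + \tilde{c})$ for any $\tilde{c} < c$

(notice that, for fixed $z$, the functions $T(D_z + c)$ are all parallel).

[Figure: the graphs of $h_u + c'$ and $h_v + c''$.]

So $T(D_z + c) = h_w + c'''$ with $w \in [u,v]$.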

Conclude: the mapping $F$ given by $F((x,c)) = (u,c')$, where $T(D_x + c) = h_u + c'$, maps intervals to intervals and so, by the fact stated before, must be affine linear:

$F(x,c) = A(x,c) + v$ for some fixed $A \in GL_{n+1}$ and $v \in \mathbb{R}^{n+1}$.

Finally, we should show that $A$ and $v$ are in fact composed of a symmetric $B \in GL_n$ and a vector $v_0 \in \mathbb{R}^n$ as follows:

$A(x,c) + v = (Bx + v_0, \langle v_0, x \rangle - c + c_0)$,

where the first coordinate is $u(x)$ and the second is $c'(x,c)$.

This is not difficult (show that: the first coordinate does not depend on $c$; the second coordinate's dependence on $c$ is involutive; the relation between the last row of $A$ and the entries of $v$ follows, again, from involution).

So that $T(D_x + c) = h_{Bx + v_0} + \langle v_0, x \rangle - c + c_0$.

End of proof.

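We have sketched the proof of:

Theorem 1: Any $T : \mathrm{Cvx}(\mathbb{R}^n) \to \mathrm{Cvx}(\mathbb{R}^n)$ satisfying that

(1) For all $f$, $T(T(f)) = f$

(2) For all $f \le g$ one has $T(f) \ge T(g)$

must satisfy, for some symmetric $B \in GL_n$, $v_0 \in \mathbb{R}^n$, and $C_0 \in \mathbb{R}$, that

$(Tf)(x) = C_0 + \langle v_0, x \rangle + (\mathcal{L}f)(Bx + v_0)$.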

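Theorem 2: Any $T : \text{Log-Conc}(\mathbb{R}^n) \to \text{Log-Conc}(\mathbb{R}^n)$ satisfying that

(1) For all $f$, $T(T(f)) = f$

(2) For all $f \le g$ one has $T(f) \ge T(g)$

must satisfy, for some symmetric $B \in GL_n$, $v_0 \in \mathbb{R}^n$, and $C_0 > 0$, that

$(Tf)(x) = C_0 \, e^{-\langle v_0, x \rangle} \inf_y \frac{e^{-\langle Bx + v_0, y \rangle}}{f(y)}$.

(first defined in A-Klartag-M)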

Theorem (Böröczky-Schneider) (using Gruber): For $n \ge 2$, any $T : K_0(\mathbb{R}^n) \to K_0(\mathbb{R}^n)$ satisfying that

(1) For all $K$, $T(T(K)) = K$

(2) For all $K_1 \subset K_2$ one has $T(K_2) \subset T(K_1)$

must satisfy for some symmetric $B \in GL_n$ that

$T(K) = BK^\circ$.

Theorem 3: Any $T : K(\mathbb{R}^n) \to K(\mathbb{R}^n)$ satisfying that

(1) For all $K$, $T(T(K)) = K$

(2) For all $K_1 \subset K_2$ one has $T(K_2) \subset T(K_1)$

must satisfy for some symmetric $B \in GL_n$ that

$T(K) = BK^\circ$.

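Theorem 4: Any $T : \mathrm{Conc}_s(\mathbb{R}^n) \to \mathrm{Conc}_s(\mathbb{R}^n)$ satisfying that

(1) For all $f$, $T(T(f)) = f$

(2) For all $f \le g$ one has $T(f) \ge T(g)$

must satisfy, for some symmetric $B \in GL_n$ and $C_0 > 0$, that

$(Tf)(x) = C_0 \inf_y \frac{(1 - \langle x, y \rangle)_+^s}{f(By)}$.

(also used in A-Klartag-M)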

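Theorem 5: Any $T : \mathrm{Cvx}(\mathbb{R}^n) \to \mathrm{Cvx}(\mathbb{R}^n)$ satisfying that

(1) For all $f$, $T(T(f)) = f$

(2) For all $f, g$ one has $T(f \,\square\, g) = T(f) + T(g)$

must satisfy for some symmetric $B \in GL_n$ that

$(Tf)(x) = (\mathcal{L}f)(Bx)$.

Here $(f \,\square\, g)(z) = \inf_{x + y = z} \left( f(x) + g(y) \right)$.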
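Condition (2) is modeled on the classical fact that the Legendre transform turns inf-convolution into addition. The sketch below checks that identity on finite sample sets in one dimension; it is an illustration only, with hypothetical names (`legendre`, `inf_convolution`, the sample `points`, the test `slopes`), and functions are represented as dictionaries from sample points to values, so the suprema and infima run over finitely many points.

```python
def legendre(values, p):
    """Sup over the finitely many sample points x of p*x - f(x)."""
    return max(p * x - fx for x, fx in values.items())

def inf_convolution(f_values, g_values):
    """(f box g)(z) = min over x + y = z of f(x) + g(y), on sampled points."""
    out = {}
    for x, fx in f_values.items():
        for y, gy in g_values.items():
            z = x + y
            out[z] = min(out.get(z, float("inf")), fx + gy)
    return out

# Step 0.25, so that all sums x + y are exact floating-point numbers.
points = [i / 4.0 for i in range(-20, 21)]
f_values = {x: abs(x) for x in points}
g_values = {x: x * x for x in points}
h_values = inf_convolution(f_values, g_values)

# L(f box g) should equal Lf + Lg at every slope p.
slopes = [-1.5, -0.5, 0.0, 0.75, 2.0]
gap = max(
    abs(legendre(h_values, p) - legendre(f_values, p) - legendre(g_values, p))
    for p in slopes
)
print(gap)
```

Because the sampled functions are finite only on a bounded set, their transforms are finite for every slope, unlike the transform of $|x|$ on all of $\mathbb{R}$; the identity nevertheless holds exactly for the sampled functions.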

(Part of a joint work with Semyon Alesker)