A geometric proof of the spectral theorem for real symmetric matrices

Robert Sachs

Department of Mathematical Sciences, George Mason University

Fairfax, Virginia 22030

January 6, 2011

R. Sachs (GMU) Geometric spectral theorem proof January 2011 1 / 21

Introduction

1 Many students find this topic difficult

2 Teaching history of math for first time: Euler, Cauchy, Sturm, Weierstrass, Fischer, Weyl, Courant

3 In usual proof orthogonality is “accidental” via symmetric matrix and inner product

4 Working on multivariable calculus book and want to do Lagrange multiplier idea without assuming linear algebra


REPRISE OF USUAL ARGUMENTS

Three main strategies: algebraic, analytic, computational

Algebraic works from invariant subspaces and the minimal polynomial, shows orthogonality, and shows that geometric and algebraic dimensions are equal.

Analytic uses Lagrange multipliers, with orthogonality constraints (later seen to be inactive).

Numerical uses Givens rotations (Euler for principal axes in 3-D); orthogonality leads to symmetric diagonalization.


KEY BACKGROUND

Quadratic equations tied to matrix form: $Q(v) = v^T H v$ with $H$ symmetric

$n = 2$ case is cute use of trig (more below) and the key step

$Q$ scales quadratically, $Q(cv) = c^2 Q(v)$, which suggests looking on unit sphere

Min and max on sphere are eigenvectors (Lagrange multipliers for unit vector constraint)

Restrictions to subspaces are also quadratic forms


THE CASE OF n = 2

$Q(x, y) = a x^2 + 2 b x y + c y^2$

Matrix form:

$Q = r^T \begin{pmatrix} a & b \\ b & c \end{pmatrix} r$

where $r$ is position vector.

On unit circle, $x = \cos t$ and $y = \sin t$

Restricted form is $a \cos^2 t + 2 b \cos t \sin t + c \sin^2 t$

Reexpressed as $\frac{a+c}{2} + \frac{a-c}{2} \cos 2t + b \sin 2t$ and also as $\frac{a+c}{2} + A \cos(2t + \phi)$

Amplitude $A$ satisfies $A^2 = \left(\frac{a-c}{2}\right)^2 + b^2 = \left(\frac{a+c}{2}\right)^2 + (b^2 - ac)$, which leads to description of max and min values as well as average value over circle.

Orthogonality of min and max vectors is basic trig!

Role of discriminant / determinant in definiteness
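The trig identities for the n = 2 case can be checked numerically. The following is a minimal sketch, not part of the talk: the function names and the sample coefficients are illustrative assumptions, and the phase convention used is A cos φ = (a − c)/2, A sin φ = −b, which is one choice consistent with the formulas above.

```python
import math

def restricted_form(a, b, c, t):
    """Q(cos t, sin t) for Q(x, y) = a x^2 + 2 b x y + c y^2."""
    x, y = math.cos(t), math.sin(t)
    return a * x * x + 2 * b * x * y + c * y * y

def amplitude_phase(a, b, c):
    """Return (mean, A, phi) with Q(cos t, sin t) = mean + A cos(2t + phi)."""
    mean = (a + c) / 2
    half_diff = (a - c) / 2
    A = math.hypot(half_diff, b)
    # Assumed convention: A cos(phi) = (a - c)/2 and A sin(phi) = -b.
    phi = math.atan2(-b, half_diff)
    return mean, A, phi

def extreme_directions(a, b, c):
    """Unit vectors where Q is largest and smallest on the unit circle."""
    _, _, phi = amplitude_phase(a, b, c)
    t_max = -phi / 2              # cos(2t + phi) = 1
    t_min = t_max + math.pi / 2   # cos(2t + phi) = -1
    v_max = (math.cos(t_max), math.sin(t_max))
    v_min = (math.cos(t_min), math.sin(t_min))
    return v_max, v_min

# Illustrative coefficients (hypothetical, chosen only for the demonstration).
a, b, c = 3.0, 1.0, 1.0
mean, A, phi = amplitude_phase(a, b, c)
v_max, v_min = extreme_directions(a, b, c)
dot = v_max[0] * v_min[0] + v_max[1] * v_min[1]
print("max value:", mean + A, "min value:", mean - A, "dot product:", dot)
```

The maximizing and minimizing angles differ by a quarter turn because the restricted form depends on 2t, which is the "basic trig" behind the orthogonality.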


MOVING UP A DIMENSION VIA MIN-MAX

Consider three variable case

Max, min values are usual Lagrange multiplier rule (multivariable calculus)

Third orthogonal direction as eigenvector less clear

Use previous step and restriction to any plane through origin ... have orthogonality of restricted max, min directions


MOVING UP A DIMENSION AND THE MIN-MAX ISSUE – cont.

Restricted form is a nice linear algebra calculation: if we look at vectors $s v_1 + t v_2$ in quadratic form, it becomes:

$\begin{pmatrix} s & t \end{pmatrix} V^T H V \begin{pmatrix} s \\ t \end{pmatrix}$

where $V$ has vectors $v_1$ and $v_2$ as columns. More on this at end.

Each plane containing origin has minimizing direction orthogonal to maximizing direction, so if we max over mins on 2-D subspaces with unit vector restriction, can restrict to a great circle orthogonal to maximizing direction.

Key point: Claim that extreme vector is an eigenvector also ... i.e. gradient is aligned in the direction of the vector.
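The restriction of the quadratic form to a plane can be sketched in a few lines. This is an illustrative example only: the matrix H, the spanning vectors, and the helper names are assumptions made for the demonstration, not taken from the talk.

```python
def mat_vec(H, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(H[i][j] * v[j] for j in range(len(v))) for i in range(len(H))]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def quadratic_form(H, v):
    """Q(v) = v^T H v."""
    return dot(v, mat_vec(H, v))

def restricted_matrix(H, basis):
    """Return V^T H V, where the vectors in `basis` are the columns of V."""
    return [[dot(vi, mat_vec(H, vj)) for vj in basis] for vi in basis]

# Hypothetical symmetric 3x3 matrix and a plane through the origin.
H = [[2.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 1.0]]
v1 = [1.0, 0.0, 1.0]
v2 = [0.0, 1.0, -1.0]
R = restricted_matrix(H, [v1, v2])

# Q evaluated at s*v1 + t*v2 should match the 2x2 form in (s, t).
s, t = 0.7, -1.2
w = [s * x + t * y for x, y in zip(v1, v2)]
lhs = quadratic_form(H, w)
rhs = quadratic_form(R, [s, t])
print("restricted matrix:", R)
print("Q(s v1 + t v2):", lhs, "restricted form:", rhs)
```

The restricted matrix is symmetric whenever H is, so the n = 2 analysis applies on every plane through the origin.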


EIGENVECTOR CLAIM / ORTHOGONALITY

In 3-D scenario, at minimax location, gradient of quadratic vanishes in admissible variation direction (angular along great circle)

Why can’t gradient have component in direction of maximal eigenvector? Which direction? (both vector and its negative are critical points, same value!)

Conclude: zero component BY REFLECTION SYMMETRY / EVENNESS OF QUADRATIC!!

This holds for all 3-planes in n dimensions – each comes algebraically as having a symmetric matrix hence quadratic form on restriction. Details later as time permits.


MIN-MAX VIEW OF EIGENVALUES

For all 2-D subspaces, can take min-max or max-min which in 3-D happen at the same place.

For higher dimensions, the min-max and max-min are typically different.

Fischer seems to be the first to do this; Courant exploited it more fully (a Wikipedia discussion on this is useless).

Continue inductively, building on higher dimensional subspaces with orthogonality going up with dimension.
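For reference, the min-max and max-min characterizations are usually written as below (the Courant–Fischer formulas). The ordering $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$ and the symbol $S$ for a subspace are conventions assumed here, not taken from the slides.

```latex
\lambda_k
  = \max_{\dim S = k} \; \min_{\substack{v \in S \\ \|v\| = 1}} v^T H v
  = \min_{\dim S = n-k+1} \; \max_{\substack{v \in S \\ \|v\| = 1}} v^T H v ,
\qquad k = 1, \dots, n .
```

For $n = 3$ and $k = 2$, both expressions range over 2-D subspaces, which fits the remark that in 3-D the min-max and max-min happen at the same place.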


VISUALIZATIONS

[Figures not reproduced]


SOME DETAILS OF USUAL PROOFS

STEP 1: Eigenvalues must be real.

Suppose not, then there is a complex conjugate pair of roots of characteristic polynomial since matrix is real.

Complex eigenvalue implies complex eigenvector

Use complex conjugate and transpose together, i.e. Hermitian conjugate, to get a contradiction, as follows:

$\bar{v}^T A v = \lambda \bar{v}^T v$ from original equation $A v = \lambda v$ after left multiplication by $\bar{v}^T$. But taking complex conjugate transpose of $A v = \lambda v$ and then right multiplying by $v$ we get (using $A^T = A$ and $A$ real) the same left hand side but on the right $\bar{\lambda} \bar{v}^T v$, so we conclude, since $v \neq 0$, that $\lambda = \bar{\lambda}$
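Written out as a chain of equalities, the argument of STEP 1 reads as follows. This is a sketch of the standard computation, using only the symbols already defined on this slide.

```latex
\begin{aligned}
A v = \lambda v
  &\;\Longrightarrow\; \bar{v}^T A v = \lambda \, \bar{v}^T v , \\
\overline{(A v)}^{\,T} = \overline{(\lambda v)}^{\,T}
  &\;\Longrightarrow\; \bar{v}^T A = \bar{\lambda} \, \bar{v}^T
  \quad (\text{using } \bar{A} = A,\ A^T = A) \\
  &\;\Longrightarrow\; \bar{v}^T A v = \bar{\lambda} \, \bar{v}^T v , \\
\text{hence}\quad (\lambda - \bar{\lambda}) \, \bar{v}^T v &= 0 ,
\qquad \bar{v}^T v = \sum_i |v_i|^2 > 0
  \;\Longrightarrow\; \lambda = \bar{\lambda} .
\end{aligned}
```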


SOME DETAILS OF USUAL PROOFS – continued

Now there is a fork in the road – algebra proof vs. analysis proof. First one of the algebra versions:

STEP 2a: Strip off rank one piece, look on orthogonal complement of span of first eigenvector.

Show that if $v^T e_1 = 0$, then $(Av)^T e_1 = 0$

Done with our favorite algebraic lemma.

Create new orthonormal basis starting with $e_1$ then write matrix in that basis, find $(n - 1) \times (n - 1)$ block and $\lambda_1$ in upper corner with rest of first row/column zeroed out.

STEP 3a: Continue in dimension $n - 1$, adding in first entry to get back to original dimension. Find second eigenvector, repeat STEP 2a.
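The lemma in STEP 2a and the resulting block form can be written out as below. This is a sketch under stated assumptions: $e_1$ is taken to be a unit eigenvector with $A e_1 = \lambda_1 e_1$, and the names $P$ (the orthogonal matrix whose first column is $e_1$) and $A'$ (the remaining block) are introduced here only for illustration.

```latex
% Lemma: the orthogonal complement of e_1 is invariant under A.
(A v)^T e_1 = v^T A^T e_1 = v^T A e_1 = \lambda_1 \, v^T e_1 = 0
\quad \text{whenever } v^T e_1 = 0 .

% Matrix in an orthonormal basis that starts with e_1:
P^T A P =
\begin{pmatrix}
\lambda_1 & 0 \\
0 & A'
\end{pmatrix},
\qquad A' \ \text{symmetric of size } (n-1) \times (n-1) .
```

Since $A'$ is again real and symmetric, the same step applies to it, which is the induction in STEP 3a.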


SOME DETAILS OF USUAL PROOFS – continued

STEP 2b: Find maximum with two constraints: unit vectors, also orthogonal to first eigenvector.

Two Lagrange multipliers – one for unit vector ($\lambda$) and a second one for orthogonality ($\mu$).

Equation: $A v = \lambda v + \mu e_1$

And then a miracle happens: $\mu = 0$ by our favorite lemma.

Find second eigenvector, then add that constraint, which is also inactive – repeat until done.
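The step $\mu = 0$ in STEP 2b can be spelled out by taking the inner product of the multiplier equation with $e_1$. This sketch assumes $e_1$ is a unit vector with $A e_1 = \lambda_1 e_1$ and that the constraint $e_1^T v = 0$ holds.

```latex
\begin{aligned}
e_1^T A v &= \lambda \, e_1^T v + \mu \, e_1^T e_1 = \mu , \\
e_1^T A v &= (A e_1)^T v = \lambda_1 \, e_1^T v = 0 , \\
\text{so}\quad \mu &= 0 \quad\text{and}\quad A v = \lambda v .
\end{aligned}
```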


SUBSPACE AND RESTRICTION

In subspace the vectors are linear combinations of some basis elements – columns of a rectangular matrix

View it as matrix product – algebra leads to restricted matrix of form: $C^T A C$ where $C$ has columns given by basis vectors – new matrix is smaller (its size is the dimension of the subspace) and symmetric.


EXTENSION / INTEGRATION

Student question: What is gradient of $v^T A v$ when $A$ is square but not symmetric?

Representative of equivalence class of $A$ under similarity – issue of transpose vs. inverse

Length of vectors in subspace squared (case of $H = I$) is useful in thinking about surfaces, differential geometry (First Fundamental Form)

Spherical and ultraspherical coordinates on unit n-sphere
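One way to explore the student question about the gradient of v^T A v is to compare the formula (A + A^T) v with a finite-difference approximation. The sketch below does this for a small non-symmetric matrix; the matrix, the test vector, and the function names are illustrative assumptions.

```python
def quadratic_form(A, v):
    """v^T A v for a square matrix stored as a list of rows."""
    n = len(v)
    return sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

def gradient_formula(A, v):
    """(A + A^T) v, the gradient of v^T A v for any square A."""
    n = len(v)
    return [sum((A[i][j] + A[j][i]) * v[j] for j in range(n)) for i in range(n)]

def gradient_numeric(A, v, h=1e-6):
    """Central-difference approximation of the gradient of v^T A v."""
    grad = []
    for i in range(len(v)):
        up = list(v)
        up[i] += h
        down = list(v)
        down[i] -= h
        grad.append((quadratic_form(A, up) - quadratic_form(A, down)) / (2 * h))
    return grad

# Hypothetical non-symmetric example.
A = [[1.0, 2.0],
     [0.0, 3.0]]
v = [1.0, -1.0]
g_formula = gradient_formula(A, v)
g_numeric = gradient_numeric(A, v)
print("formula:", g_formula, "numeric:", g_numeric)
```

Only the symmetric part (A + A^T)/2 contributes to the quadratic form, which is why a symmetric H can be assumed without loss of generality; for symmetric A the gradient reduces to 2 A v.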


EXTENSION / INTEGRATION – cont.

Attempt to develop theory for constrained max/min and Hessian matrix

Eigenvalues of $A^T A$ when $A$ is rectangular

Complex eigenvalues for non-symmetric real $A$ – what do they mean geometrically

For complex vector spaces, how is symmetric matrix extended?


EXTENSION / INTEGRATION (continued)

Non-symmetric matrices and Jordan form

Schur Factorization

Ratios of quadratic expressions (non-zero denominator!) tie to curvature calculations on surfaces.

Classical view near max/min in 2-D: values taken on twice leads to discriminant – classic principal curvatures computation.

Fun in number theory: quadratic forms – normal form using integer lattice transformations


CONCLUDING REMARKS

Orthogonality now comes from 2-D geometry exploited ruthlessly

Rich area for visualization, experiment, conjecture in high school

Hate to banish really slick lemmas – love the algebraic fun