Quadratic form and functional optimization
Transcript of Quadratic form and functional optimization
![Page 1: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/1.jpg)
Quadratic Form and Functional Optimization
9th June, 2011 Junpei Tsuji
![Page 2: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/2.jpg)
Optimization of a multivariate quadratic function

$$J(x_1, x_2) = 1.2 + (0.2,\ 0.3)\begin{pmatrix}x_1\\x_2\end{pmatrix} + \frac{1}{2}(x_1,\ x_2)\begin{pmatrix}3 & 1\\1 & 4\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}$$

$$= 1.2 + 0.2x_1 + 0.3x_2 + \frac{3}{2}x_1^2 + x_1x_2 + 2x_2^2$$

Minimum: $(x_1, x_2, J) \approx (-0.045,\ -0.064,\ 1.1859)$
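As a quick numerical check of this example (not part of the original slides), the stationary point can be computed with NumPy by solving the linear system that sets the gradient to zero:

```python
import numpy as np

# Quadratic from the slide: J(x) = 1.2 + b^T x + (1/2) x^T A x
A = np.array([[3.0, 1.0],
              [1.0, 4.0]])
b = np.array([0.2, 0.3])

def J(x):
    return 1.2 + b @ x + 0.5 * x @ A @ x

# Stationary point: grad J = b + A x = 0  =>  x* = -A^{-1} b
x_star = -np.linalg.solve(A, b)
print(np.round(x_star, 3))      # approximately [-0.045 -0.064]
print(round(J(x_star), 4))
```

Because $\mathbf{A}$ is positive definite, this stationary point is the minimum.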
![Page 3: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/3.jpg)
Quadratic approximation by Taylor expansion

$$f(\mathbf{x}) \approx \bar{f} + \bar{\mathbf{g}}^T(\mathbf{x}-\bar{\mathbf{x}}) + \frac{1}{2}(\mathbf{x}-\bar{\mathbf{x}})^T \bar{\mathbf{H}} (\mathbf{x}-\bar{\mathbf{x}})$$

(constant, linear form, and quadratic form terms)

where

- $\mathbf{x} := (x_1, x_2, \cdots, x_n)^T$
- $\bar{f} := f(\bar{\mathbf{x}})$
- $\bar{\mathbf{g}} := \left(\dfrac{\partial f}{\partial x_1}, \dfrac{\partial f}{\partial x_2}, \cdots, \dfrac{\partial f}{\partial x_n}\right)^T_{\mathbf{x}=\bar{\mathbf{x}}}$ : gradient (Jacobian)
- $\bar{\mathbf{H}} := \begin{pmatrix}\dfrac{\partial^2 f}{\partial x_1\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_1\partial x_n}\\ \vdots & \ddots & \vdots\\ \dfrac{\partial^2 f}{\partial x_n\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n\partial x_n}\end{pmatrix}_{\mathbf{x}=\bar{\mathbf{x}}}$ : Hessian (constant)
![Page 4: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/4.jpg)
Completing the square

$$f(\mathbf{x}) = \bar{f} + \bar{\mathbf{g}}^T(\mathbf{x}-\bar{\mathbf{x}}) + \frac{1}{2}(\mathbf{x}-\bar{\mathbf{x}})^T\bar{\mathbf{H}}(\mathbf{x}-\bar{\mathbf{x}})$$

- Let $\bar{\mathbf{x}} = \mathbf{x}^*$, where $\mathbf{g}(\mathbf{x}^*) = \mathbf{0}$. Then

$$f(\mathbf{x}) = f^* + \frac{1}{2}(\mathbf{x}-\mathbf{x}^*)^T\mathbf{H}^*(\mathbf{x}-\mathbf{x}^*)$$

(constant and quadratic form terms only)
![Page 5: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/5.jpg)
Completing the square: $f(\mathbf{x}) = c + \mathbf{b}^T\mathbf{x} + \dfrac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$

$$f(\mathbf{x}) = d + \frac{1}{2}(\mathbf{x}-\mathbf{x}_0)^T\mathbf{A}(\mathbf{x}-\mathbf{x}_0) = d + \frac{1}{2}\mathbf{x}_0^T\mathbf{A}\mathbf{x}_0 - \frac{1}{2}\mathbf{x}_0^T(\mathbf{A}+\mathbf{A}^T)\mathbf{x} + \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$$

Matching coefficients:

- $\mathbf{b}^T = -\dfrac{1}{2}\mathbf{x}_0^T(\mathbf{A}+\mathbf{A}^T)$, so $\mathbf{x}_0^T = -2\mathbf{b}^T(\mathbf{A}+\mathbf{A}^T)^{-1}$ and $\mathbf{x}_0 = -2(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{b}$
- $c = d + \dfrac{1}{2}\mathbf{x}_0^T\mathbf{A}\mathbf{x}_0$, so $d = c - \dfrac{1}{2}\mathbf{x}_0^T\mathbf{A}\mathbf{x}_0 = c - 2\mathbf{b}^T(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{A}(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{b}$

Therefore,

$$f(\mathbf{x}) = c - 2\mathbf{b}^T(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{A}(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{b} + \frac{1}{2}\left(\mathbf{x}+2(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{b}\right)^T\mathbf{A}\left(\mathbf{x}+2(\mathbf{A}+\mathbf{A}^T)^{-1}\mathbf{b}\right)$$

- If $\mathbf{A}$ is a symmetric matrix,

$$f(\mathbf{x}) = c - \frac{1}{2}\mathbf{b}^T\mathbf{A}^{-1}\mathbf{b} + \frac{1}{2}(\mathbf{x}+\mathbf{A}^{-1}\mathbf{b})^T\mathbf{A}(\mathbf{x}+\mathbf{A}^{-1}\mathbf{b})$$
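The general (non-symmetric $\mathbf{A}$) identity above can be checked numerically. This is a sketch added here for verification, with a randomly generated $\mathbf{A}$, $\mathbf{b}$, $c$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))          # a general (non-symmetric) matrix
b = rng.normal(size=n)
c = 1.7

def f(x):
    return c + b @ x + 0.5 * x @ A @ x

# Completing the square:
#   x0 = -2 (A + A^T)^{-1} b,   d = c - (1/2) x0^T A x0
x0 = -2.0 * np.linalg.solve(A + A.T, b)
d = c - 0.5 * x0 @ A @ x0

x = rng.normal(size=n)               # arbitrary test point
lhs = f(x)
rhs = d + 0.5 * (x - x0) @ A @ (x - x0)
print(np.isclose(lhs, rhs))          # True: the two forms agree everywhere
```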
![Page 6: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/6.jpg)
Quadratic form

$$f(\mathbf{x}) = \mathbf{x}^T\mathbf{S}\mathbf{x}$$

where $\mathbf{S}$ is a symmetric matrix.
![Page 7: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/7.jpg)
Symmetric matrix

- A symmetric matrix $\mathbf{S}$ is defined as a matrix that satisfies $\mathbf{S}^T = \mathbf{S}$.
- A symmetric matrix $\mathbf{S}$ has real eigenvalues $\lambda_i$ and eigenvectors $\mathbf{u}_i$ that form an orthonormal basis:

$$\mathbf{S}\mathbf{u}_i = \lambda_i\mathbf{u}_i, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$$

$$(\mathbf{u}_i, \mathbf{u}_j) = \delta_{ij} \quad (\delta_{ij}\text{ is the Kronecker delta})$$
![Page 8: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/8.jpg)
Diagonalization of a symmetric matrix

- Define an orthogonal matrix $\mathbf{U}$ as follows: $\mathbf{U} = (\mathbf{u}_1, \mathbf{u}_2, \cdots, \mathbf{u}_n)$
- Then $\mathbf{U}$ satisfies $\mathbf{U}^T\mathbf{U} = \mathbf{I}$, hence $\mathbf{U}^{-1} = \mathbf{U}^T$, where $\mathbf{I}$ is the identity matrix.

$$\mathbf{S}\mathbf{U} = \mathbf{S}(\mathbf{u}_1, \cdots, \mathbf{u}_n) = (\mathbf{S}\mathbf{u}_1, \cdots, \mathbf{S}\mathbf{u}_n) = (\lambda_1\mathbf{u}_1, \cdots, \lambda_n\mathbf{u}_n) = \mathbf{U}\,\mathrm{diag}(\lambda_1, \cdots, \lambda_n)$$

$$\therefore\ \mathbf{S} = \mathbf{U}\,\mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n)\,\mathbf{U}^T$$
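This decomposition can be reproduced with NumPy's `eigh` routine for symmetric matrices (a verification sketch; note `eigh` returns eigenvalues in ascending rather than the slide's descending order):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3))
S = M + M.T                      # construct a symmetric matrix

lam, U = np.linalg.eigh(S)       # real eigenvalues, orthonormal eigenvectors
print(np.allclose(U.T @ U, np.eye(3)))           # U^T U = I
print(np.allclose(S, U @ np.diag(lam) @ U.T))    # S = U diag(lam) U^T
```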
![Page 9: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/9.jpg)
Transformation to principal axes

$$f(\mathbf{x}) = \mathbf{x}^T\mathbf{S}\mathbf{x}$$

- Now substitute $\mathbf{x} = \mathbf{U}\mathbf{z}$, where $\mathbf{z} = (z_1, z_2, \cdots, z_n)^T$:

$$f(\mathbf{U}\mathbf{z}) = (\mathbf{U}\mathbf{z})^T\mathbf{S}(\mathbf{U}\mathbf{z}) = \mathbf{z}^T\mathbf{U}^T\mathbf{S}\mathbf{U}\mathbf{z} = \mathbf{z}^T\,\mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n)\,\mathbf{z}$$

$$\therefore\ f = \sum_{i=1}^n \lambda_i z_i^2$$
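The principal-axis identity $\mathbf{x}^T\mathbf{S}\mathbf{x} = \sum_i \lambda_i z_i^2$ with $\mathbf{z} = \mathbf{U}^T\mathbf{x}$ can be checked on a random symmetric matrix (a small verification sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(3, 3))
S = M + M.T                      # symmetric matrix
lam, U = np.linalg.eigh(S)

x = rng.normal(size=3)
z = U.T @ x                      # principal-axis coordinates: x = U z
print(np.isclose(x @ S @ x, np.sum(lam * z**2)))   # True
```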
![Page 10: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/10.jpg)
Contour surfaces

- If we set $f(\mathbf{x})$ equal to a constant $c$:

$$f(\mathbf{x}) = \sum_{i=1}^n \lambda_i z_i^2 = c$$

- When $n = 2$:
  - the locus of $\mathbf{z}$ is an ellipse if $\lambda_1\lambda_2 > 0$;
  - the locus of $\mathbf{z}$ is a hyperbola if $\lambda_1\lambda_2 < 0$.
![Page 11: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/11.jpg)
Contour surfaces

[Figure: contours of $f(\mathbf{z}) = \sum_{i=1}^2 \lambda_i z_i^2 = \mathrm{const.}$ in the $(z_1, z_2)$ plane for $\lambda_1\lambda_2 > 0$; example $f(x_1, x_2) = -x_1^2 - 2x_2^2 + 20.0$. The center is a maximal or minimal point.]
![Page 12: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/12.jpg)
Transformation to principal axes

[Figure: contours $f(\mathbf{x}) = \mathrm{const.}$ with principal axes $x'_1, x'_2$; the rotation $\mathbf{x}' = \mathbf{U}^T\mathbf{x}$, i.e. $\mathbf{x} = \mathbf{U}\mathbf{x}'$, aligns the coordinate axes with the principal axes.]
![Page 13: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/13.jpg)
Parallel translation

[Figure: contours $f(\mathbf{x}) = \mathrm{const.}$ in the $(x_1, x_2)$ plane centered at $\bar{\mathbf{x}}$; the translation $\mathbf{x}' = \mathbf{x} - \bar{\mathbf{x}}$ moves the center to the origin of the $(x'_1, x'_2)$ axes.]
![Page 14: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/14.jpg)
Contour surface of a quadratic function

$$f(\mathbf{x}) = f^* + \frac{1}{2}(\mathbf{x}-\mathbf{x}^*)^T\mathbf{H}^*(\mathbf{x}-\mathbf{x}^*)$$

[Figure: contours $f(\mathbf{x}) = \mathrm{const.}$ in the $(x_1, x_2)$ plane, centered at $\mathbf{x}^*$.]
![Page 15: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/15.jpg)
Contour surfaces

[Figure: contours of $f(\mathbf{z}) = \sum_{i=1}^2 \lambda_i z_i^2 = \mathrm{const.}$ in the $(z_1, z_2)$ plane for $\lambda_1\lambda_2 < 0$; example $f(x_1, x_2) = x_1^2 - x_2^2$. The center is a saddle point.]
![Page 16: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/16.jpg)
Stationary points

[Figure: surface of $f(x_1, x_2) = x_1^3 + x_2^3 + 3x_1x_2 + 2$, showing a saddle point and a maximal point.]
![Page 17: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/17.jpg)
Stationary points

[Figure: surface of $f(x_1, x_2) = \exp\left(-\frac{1}{3}x_1^3 + x_1 - x_2^2\right)$, showing a maximal point and a saddle point.]
![Page 18: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/18.jpg)
![Page 19: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/19.jpg)
Newton–Raphson method

- Newton's method is an approximate solver of $\mathbf{g}(\mathbf{x}) = \mathbf{0}$ (a stationary point of $f$) that repeatedly applies a quadratic approximation.

$$f(\mathbf{x} + \Delta\mathbf{x}) \approx f(\mathbf{x}) + \mathbf{g}(\mathbf{x})^T\Delta\mathbf{x} + \frac{1}{2}\Delta\mathbf{x}^T\mathbf{H}(\mathbf{x})\Delta\mathbf{x}$$

$$\frac{\partial f(\mathbf{x}+\Delta\mathbf{x})}{\partial\,\Delta\mathbf{x}} = \mathbf{g}(\mathbf{x}) + \mathbf{H}(\mathbf{x})\Delta\mathbf{x}$$

[Figure: $f(\mathbf{x})$, its quadratic approximation at $\mathbf{x}$, the step to $\mathbf{x}+\Delta\mathbf{x}$, and the stationary point $\mathbf{x}^*$ where $\mathbf{g}(\mathbf{x}^*) = \mathbf{0}$.]
![Page 20: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/20.jpg)
Algorithm of Newton's method

Procedure Newton($\mathbf{g}(\mathbf{x})$, $\mathbf{H}(\mathbf{x})$):

1. Initialize $\mathbf{x}$.
2. Calculate $\mathbf{g}(\mathbf{x})$ and $\mathbf{H}(\mathbf{x})$.
3. Solve the following simultaneous equations, giving $\Delta\mathbf{x}$: $\mathbf{g}(\mathbf{x}) + \mathbf{H}(\mathbf{x})\Delta\mathbf{x} = \mathbf{0}$
4. Update $\mathbf{x}$ as follows: $\mathbf{x} \leftarrow \mathbf{x} + \Delta\mathbf{x}$
5. If $\|\Delta\mathbf{x}\| < \delta$ then return $\mathbf{x}$, else go back to 2.
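The procedure above translates directly into NumPy. Here is a minimal sketch, reusing the quadratic example from page 2 (for which a single Newton step already reaches the stationary point):

```python
import numpy as np

def newton(grad, hess, x, delta=1e-10, max_iter=100):
    """Newton's method: solve g(x) + H(x) dx = 0, then update x <- x + dx."""
    for _ in range(max_iter):
        dx = np.linalg.solve(hess(x), -grad(x))
        x = x + dx
        if np.linalg.norm(dx) < delta:
            break
    return x

# Example: J(x) = 1.2 + b^T x + (1/2) x^T A x, so g(x) = b + A x, H(x) = A
A = np.array([[3.0, 1.0], [1.0, 4.0]])
b = np.array([0.2, 0.3])
grad = lambda x: b + A @ x
hess = lambda x: A

x_star = newton(grad, hess, np.zeros(2))
print(np.round(x_star, 3))   # stationary point, approximately [-0.045 -0.064]
```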
![Page 21: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/21.jpg)
Linear regression

$$y = f(\mathbf{x}) = \beta_0 + \sum_{j=1}^p \beta_j x_j$$

[Figure: $n$ samples $(\mathbf{x}_i, y_i)$ in a $p$-dimensional space, fitted by the hyperplane $y = f(\mathbf{x})$.]

We would like to find $\boldsymbol{\beta}^*$ that minimizes the residual sum of squares (RSS).
![Page 22: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/22.jpg)
Linear regression
min๐ท
RSS ๐ท โข where
RSS ๐ท = ๏ฟฝ ๐ฆ๐ โ ๐ ๐๐ 2๐
๐=1
= ๏ฟฝ ๐ฆ๐ โ ๐ฝ0 + ๏ฟฝ๐ฝ๐๐ฅ๐๐
๐
๐=1
2๐
๐=1
โข Given ๐ฟ,๐,๐ท as follows:
๐ฟ =๐ฅ11 โฏ ๐ฅ1๐โฎ โฑ โฎ๐ฅ๐1 โฏ ๐ฅ๐๐
1โฎ1
, ๐ =๐ฆ1โฎ๐ฆ๐
, ๐ท =๐ฝ1โฎ๐ฝ๐
โด RSS ๐ท = ๐ โ ๐ฟ๐ท 2
![Page 23: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/23.jpg)
Linear regression

$$\mathrm{RSS}(\boldsymbol{\beta}) = J(\boldsymbol{\beta}) = \|\mathbf{y}-\mathbf{X}\boldsymbol{\beta}\|^2 = (\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{y}-\mathbf{X}\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{y} - \mathbf{y}^T\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$

Using the matrix-derivative identities

- $\dfrac{\partial}{\partial\boldsymbol{\beta}}\,\mathbf{a}^T\boldsymbol{\beta} = \mathbf{a}$
- $\dfrac{\partial}{\partial\boldsymbol{\beta}}\,\boldsymbol{\beta}^T\mathbf{a} = \mathbf{a}$
- $\dfrac{\partial}{\partial\boldsymbol{\beta}}\,\boldsymbol{\beta}^T\mathbf{A}\boldsymbol{\beta} = (\mathbf{A}+\mathbf{A}^T)\boldsymbol{\beta}$

and the symmetry of $\mathbf{X}^T\mathbf{X}$:

$$J'(\boldsymbol{\beta}) = \frac{\partial J}{\partial\boldsymbol{\beta}} = -2\mathbf{X}^T\mathbf{y} + 2\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$
![Page 24: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/24.jpg)
Linear regression

Given $\boldsymbol{\beta}^*$ that satisfies $J'(\boldsymbol{\beta}^*) = \mathbf{0}$:

$$\mathbf{X}^T\mathbf{y} = \mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^*, \qquad \mathbf{y}^T\mathbf{X} = \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}$$

$$\therefore\ \boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

$$\therefore\ J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$

$$= \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* + \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$

$$\therefore\ J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* + (\boldsymbol{\beta}-\boldsymbol{\beta}^*)^T\mathbf{X}^T\mathbf{X}(\boldsymbol{\beta}-\boldsymbol{\beta}^*)$$

(completing the square)
![Page 25: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/25.jpg)
Linear regression

$$J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* + (\boldsymbol{\beta}-\boldsymbol{\beta}^*)^T\mathbf{X}^T\mathbf{X}(\boldsymbol{\beta}-\boldsymbol{\beta}^*) = \|\mathbf{y}-\mathbf{X}\boldsymbol{\beta}^*\|^2 + (\boldsymbol{\beta}-\boldsymbol{\beta}^*)^T\mathbf{X}^T\mathbf{X}(\boldsymbol{\beta}-\boldsymbol{\beta}^*)$$

$$= J(\boldsymbol{\beta}^*) + \frac{1}{2}(\boldsymbol{\beta}-\boldsymbol{\beta}^*)^T\mathbf{H}(\boldsymbol{\beta}-\boldsymbol{\beta}^*)$$

(constant plus quadratic form: the residual sum of squares (RSS) of linear regression)

[Figure: contours $J(\boldsymbol{\beta}) = \mathrm{const.}$ in the $(\beta_1, \beta_2)$ plane, centered at $\boldsymbol{\beta}^*$.]

$$\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}, \qquad \mathbf{H} = 2\mathbf{X}^T\mathbf{X}$$
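The closed form $\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ can be compared against NumPy's least-squares solver (a verification sketch on synthetic data; the coefficient values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 3
X = np.hstack([rng.normal(size=(n, p)), np.ones((n, 1))])  # last column: intercept
beta_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Normal equations: beta* = (X^T X)^{-1} X^T y
beta_star = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from NumPy's least-squares solver
beta_ref = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(beta_star, beta_ref))   # True
```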
![Page 26: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/26.jpg)
Hessian

- $\mathbf{H} := \left(\dfrac{\partial^2 J}{\partial\beta_i\,\partial\beta_j}\right) = 2\mathbf{X}^T\mathbf{X}$
- $\mathbf{H}$ has the following two features:
  - symmetric matrix: $\mathbf{H}^T = \mathbf{H}$
  - positive-definite matrix: $\forall\mathbf{u}\ne\mathbf{0},\ \mathbf{u}^T\mathbf{H}\mathbf{u} > 0$

Therefore, $\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ is the minimum of $J(\boldsymbol{\beta})$.
![Page 27: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/27.jpg)
Analysis of residuals

$$\mathbf{y}^* = \mathbf{X}\boldsymbol{\beta}^*$$

- Substituting $\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ in the above:

$$\mathbf{y}^* = \mathbf{X}\boldsymbol{\beta}^* = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\,\mathbf{y}$$

$$\therefore\ \mathbf{y}^* = \mathcal{H}\mathbf{y} \quad (\mathcal{H}\text{: hat matrix})$$

- The vector of residuals $\mathbf{e}$ can then be expressed as follows:

$$\mathbf{e} = \mathbf{y} - \mathbf{y}^* = \mathbf{y} - \mathcal{H}\mathbf{y} = (\mathbf{I}-\mathcal{H})\mathbf{y}$$

$$\mathbf{e}^T\mathbf{e} = \mathbf{y}^T(\mathbf{I}-\mathcal{H})^T(\mathbf{I}-\mathcal{H})\mathbf{y}$$
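A useful consequence of the normal equations, $\mathbf{X}^T\mathbf{e} = \mathbf{X}^T\mathbf{y} - \mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* = \mathbf{0}$, is that the residuals are orthogonal to every column of $\mathbf{X}$. A quick numerical sketch (synthetic data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(6)
X = np.hstack([rng.normal(size=(12, 2)), np.ones((12, 1))])
y = rng.normal(size=12)

beta_star = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_star               # residual vector e = (I - H) y
print(np.allclose(X.T @ e, 0))      # True: residuals orthogonal to columns of X
```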
![Page 28: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/28.jpg)
Analysis of residuals

$$\mathcal{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$$

The hat matrix $\mathcal{H}$ is a projection matrix, which satisfies the following equations:

1. Projection: $\mathcal{H}^2 = \mathcal{H}$

$$\mathcal{H}^2 = \mathcal{H}\mathcal{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T \cdot \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{X})(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T = \mathcal{H}$$

2. Orthogonal (symmetric): $\mathcal{H}^T = \mathcal{H}$
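Both projection properties are easy to confirm numerically (a verification sketch with a random design matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.hstack([rng.normal(size=(10, 2)), np.ones((10, 1))])
H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix X (X^T X)^{-1} X^T

print(np.allclose(H @ H, H))    # projection: H^2 = H
print(np.allclose(H.T, H))      # symmetric:  H^T = H
```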
![Page 29: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/29.jpg)
Analysis of residuals

$$\begin{pmatrix}y_1^*\\ \vdots\\ y_n^*\end{pmatrix} = \begin{pmatrix}x_{11} & \cdots & x_{1p} & 1\\ \vdots & \ddots & \vdots & \vdots\\ x_{n1} & \cdots & x_{np} & 1\end{pmatrix}\begin{pmatrix}\beta_1^*\\ \vdots\\ \beta_p^*\\ \beta_0^*\end{pmatrix} = \beta_1^*\begin{pmatrix}x_{11}\\ \vdots\\ x_{n1}\end{pmatrix} + \cdots + \beta_p^*\begin{pmatrix}x_{1p}\\ \vdots\\ x_{np}\end{pmatrix} + \beta_0^*\begin{pmatrix}1\\ \vdots\\ 1\end{pmatrix}$$

This is a linear combination in a $(p+1)$-dimensional vector space, spanned by the columns $\mathbf{a}_1, \cdots, \mathbf{a}_p, \mathbf{a}_{p+1} = \mathbf{1}$.
![Page 30: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/30.jpg)
Analysis of residuals

[Figure: in the $n$-dimensional space, $\mathbf{y}^* = \mathcal{H}\mathbf{y}$ is the projection of $\mathbf{y}$ onto the $(p+1)$-dimensional hyperplane spanned by the columns $\mathbf{a}_i, \mathbf{a}_j, \cdots$; the residual $\mathbf{e}$ is orthogonal to that hyperplane.]
![Page 31: Quadratic form and functional optimization](https://reader033.fdocuments.us/reader033/viewer/2022060116/557dc639d8b42a8f188b5501/html5/thumbnails/31.jpg)
Analysis of residuals

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta}$$

- $\boldsymbol{\beta} = \mathbf{X}^{-}\mathbf{y}$, where $\mathbf{X}^{-}$ is the Moore–Penrose generalized inverse (pseudoinverse).

1. Unique solution: $n = p$
2. Many solutions: $n < p$
3. No solution: $n > p$

$$\mathbf{X}^{-} = \begin{cases}\mathbf{X}^{-1} & (n = p)\\[4pt] \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1} & (n < p:\ \boldsymbol{\beta} = \mathbf{X}^{-}\mathbf{y}\text{ minimizes }\|\boldsymbol{\beta}\|)\\[4pt] (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T & (n > p:\ \boldsymbol{\beta} = \mathbf{X}^{-}\mathbf{y}\text{ minimizes }\|\mathbf{y}-\mathbf{X}\boldsymbol{\beta}\|^2)\end{cases}$$
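NumPy's `pinv` computes the Moore–Penrose pseudoinverse, and in the full-rank over- and underdetermined cases it reduces to the two closed forms above (a verification sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(5)

# Overdetermined (n > p): no exact solution; pinv gives the least-squares fit
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
beta = np.linalg.pinv(X) @ y
print(np.allclose(beta, np.linalg.solve(X.T @ X, X.T @ y)))   # (X^T X)^{-1} X^T y

# Underdetermined (n < p): many solutions; pinv gives the minimum-norm one
X2 = rng.normal(size=(3, 8))
y2 = rng.normal(size=3)
beta2 = np.linalg.pinv(X2) @ y2
print(np.allclose(beta2, X2.T @ np.linalg.solve(X2 @ X2.T, y2)))   # X^T (X X^T)^{-1} y
```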