Linear and Non Linear Optimization HW2


Muhammad Junaid Farooq, Lorenzo Ernesto Ochoa Diaz
February 20th, 2014

    Assignment 2

Problem 1: Problem in R^100

The objective is

f(x) = -\sum_{i=1}^{500} \ln\!\left(1 - a_i^T x\right) - \sum_{j=1}^{100} \ln\!\left(1 - x_j^2\right), \qquad x \in \mathbb{R}^{100},

where a_i denotes the i-th column of A \in \mathbb{R}^{100\times 500}.

Gradient Expression:

\frac{\partial f}{\partial x_j} = \sum_{i=1}^{500} \frac{a_{ij}}{1 - a_i^T x} + \frac{2 x_j}{1 - x_j^2}, \qquad j = 1,\dots,100.

Hessian Expression:

\frac{\partial^2 f}{\partial x_j \, \partial x_k} = \sum_{i=1}^{500} \frac{a_{ij}\, a_{ik}}{\left(1 - a_i^T x\right)^2}, \qquad j \neq k,

\frac{\partial^2 f}{\partial x_j^2} = \sum_{i=1}^{500} \frac{a_{ij}^2}{\left(1 - a_i^T x\right)^2} + \frac{2 + 2 x_j^2}{\left(1 - x_j^2\right)^2}.

    Routine for Objective function

function f = f1(A,x)
    for i = 1:500
        m(i) = (A(:,i))'*x;
    end
    u = log(1 - m);          % natural log, consistent with the gradient and Hessian above
    v = log(1 - x.^2);
    f = -1*sum(u) - 1*sum(v);
end

    Routine for gradient computation
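A minimal sketch of the gradient routine, consistent with the gradient expression above and with the calling convention gradf1(A,x) used by the other routines; treat it as a reconstruction rather than the original listing.

% Sketch of gradf1 (reconstruction), returning a 100x1 column vector.
function g = gradf1(A,x)
    g = zeros(100,1);
    for i = 1:500
        m(i) = 1 - (A(:,i))'*x;                    % margins 1 - a_i'*x
    end
    for j = 1:100
        g(j) = sum(A(j,:)./m) + 2*x(j)/(1 - (x(j))^2);
    end
end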


    Routine for Hessian computation

function g = hessf1(A,x)
    g = zeros(100,100);
    for i = 1:500
        m(i) = (1 - (A(:,i))'*x).^2;     % squared margins (1 - a_i'*x)^2
    end
    % diagonal entries
    for j = 1:100
        g(j,j) = sum(((A(j,:)).^2)./m) + (2+2*(x(j)).^2)/((1-(x(j)).^2).^2);
    end
    % off-diagonal entries (symmetric)
    for j = 1:100
        for k = j+1:100
            for i = 1:500
                u(i) = A(j,i)*A(k,i);
            end
            g(j,k) = sum(u./m);
            g(k,j) = g(j,k);
        end
    end
end

Newton Backtracking Routines

function t = backtrackLineSearch(f, gk, pk, xk, A)
    a = 0.1; b = 0.8;
    t = 1;
    % first shrink t until the trial point is strictly feasible
    % (all arguments of the log barriers positive)
    while( cond1(A, xk + t*pk) <= 0 || cond2(A, xk + t*pk) <= 0 )
        t = b * t;
    end
    % standard backtracking on the sufficient-decrease (Armijo) condition
    while( f(A, xk + t*pk) > f(A, xk) + a*t*gk'*pk )
        t = b * t;
    end
end

function [x, hist] = newtonBT(f, grad, hess, A, x0)
    x = x0; hist = x0; tol = 1e-5;
    while( norm(grad(A,x)) > tol )
        p = - hess(A,x) \ grad(A,x);                      % Newton direction
        t = backtrackLineSearch(f, grad(A,x), p, x, A);
        x = x + t*p;
        hist = [hist x];
    end
end

function g = cond1(A,x)
    for i = 1:500
        m(i) = (A(:,i))'*x;
    end
    g = min(1 - m);          % smallest margin of the constraints 1 - a_i'*x > 0
end

function g = cond2(A,x)
    for i = 1:100


        m(i) = 1 - (x(i)).^2;
    end
    g = min(m);              % smallest margin of the constraints 1 - x_j^2 > 0
end

    Main Routine

close all; clear all;
A = 0.5 + 0.5*randn([100,500]);
x = 0.1*ones(100,1);
f = f1(A,x);
G = gradf1(A,x);
H = hessf1(A,x);

[ys, his] = newtonBT(@f1, @gradf1, @hessf1, A, x);

nsteps = size(his,2);
for k = 1:nsteps
    fhist(k) = f1(A, his(:,k));
    e(k) = fhist(k) - f1(A, ys);
end

figure;
semilogy(1:nsteps, abs(e))
xlabel('iteration count'); ylabel('|f^k - f^*|');

    Convergence Behavior

[Figure: semilog plot of |f^k - f^*| versus iteration count (x-axis 1 to 10, y-axis roughly 10^-10 to 10^4).]


    Problem 2: Minimal Surface
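The discretized objective implemented in f2 below is the lateral surface area of the piecewise-linear surface of revolution; with h = l/(n+1) and both boundary radii equal to r, it reads (as reconstructed from the routine)

f(x) = 2\pi h\left[\, x_1\sqrt{1+\left(\tfrac{x_1-r}{h}\right)^{2}} + \sum_{i=1}^{n-1} x_i\sqrt{1+\left(\tfrac{x_{i+1}-x_i}{h}\right)^{2}} + x_n\sqrt{1+\left(\tfrac{r-x_n}{h}\right)^{2}}\,\right].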

    Routine for Objective function

function g = f2(x,n,l,r)
    h = l/(n+1);
    a = 2*pi*h*x(1)*sqrt( 1 + (( x(1) - r )/h)^2 );
    b = 0;
    for i = 1:n-1
        u = 2*pi*h*x(i)*sqrt( 1 + (( x(i+1) - x(i) )/h)^2 );
        b = b + u;
    end
    c = 2*pi*h*x(n)*sqrt( 1 + (( r - x(n) )/h)^2 );
    g = a + b + c;
end

    Routine for gradient computation

function g = gradf2(x,n,l,r)
    h = l/(n+1);
    a = (x(1) - r)/h;    b = (x(2) - x(1))/h;
    c = 1 + a^2;         d = 1 + b^2;
    u = (r - x(n))/h;    v = (x(n) - x(n-1))/h;
    w = 1 + u^2;         z = 1 + v^2;
    g(1) = 2*pi*( x(1)*a*c^(-0.5) + h*sqrt(c) + h*sqrt(d) - x(1)*b*d^(-0.5) );
    g(n) = 2*pi*( -1*x(n)*u*w^(-0.5) + h*w^0.5 + x(n-1)*v*z^(-0.5) );
    for m = 2:n-1
        p = (x(m) - x(m-1))/h;   q = 1 + p^2;
        s = (x(m+1) - x(m))/h;   t = 1 + s^2;
        g(m) = 2*pi*( x(m-1)*p*q^(-0.5) + h*sqrt(t) - x(m)*s*t^(-0.5) );
    end
end

Routine for Hessian computation

function g = hessf2(x,n,l,r)
    h = l/(n+1);
    %g = zeros(n,n);
    a = (x(1) - r)/h;    b = (x(2) - x(1))/h;
    c = 1 + a^2;         d = 1 + b^2;
    u = (r - x(n))/h;    v = (x(n) - x(n-1))/h;
    w = 1 + u^2;         z = 1 + v^2;


    g(1,1)   = 2*pi*( -1*x(1)/h*a^2*c^-1.5 + ((2*x(1)-r)/h)*c^-0.5 + a*c^-0.5 - x(1)/h*b^2*d^-1.5 - b*d^-0.5 - ((x(2)-2*x(1))/h)*d^-0.5 );
    g(1,2)   = 2*pi*( x(1)/h*b^2*d^-1.5 - x(1)/h*d^-1.5 + b*d^-0.5 );
    g(n,n-1) = 2*pi*( x(n-1)/h*v^2*z^-1.5 + ((x(n)-2*x(n-1))/h)*z^-0.5 );
    g(n,n)   = 2*pi*( -1*x(n-1)/h*v^2*z^-1.5 - ((r-2*x(n))/h)*w^-0.5 - u*w^-0.5 - x(n)/h*u^2*w^-1.5 + x(n-1)/h*z^-0.5 );

    for m = 2:n-1
        p = (x(m) - x(m-1))/h;   q = 1 + p^2;
        s = (x(m+1) - x(m))/h;   t = 1 + s^2;
        g(m,m)   = 2*pi*( -1*x(m-1)/h*p^2*q^(-1.5) + x(m-1)/h*q^(-0.5) - x(m)/h*s^2*t^(-1.5) - ((x(m+1)-2*x(m))/h)*t^-0.5 + s*t^-1.5 );
        g(m,m-1) = 2*pi*( x(m-1)/h*p^2*q^-1.5 + ((x(m)-2*x(m-1))/h)*q^-0.5 );
        g(m,m+1) = 2*pi*( x(m)/h*s^2*t^-1.5 - x(m)/h*t^-0.5 + s*t^-0.5 );
    end
end

Newton Backtracking Routines

function t = backtrackLineSearch(f, gk, pk, xk, n, l, r)
    a = 0.1; b = 0.8;
    t = 1;
    while( f(xk + t*pk, n, l, r) > f(xk, n, l, r) + a*t*gk'*pk )
        t = b * t;
    end
end

function [x, hist] = newtonBT(f, grad, hess, x0, n, l, r)
    x = x0; hist = x0; tol = 1e-5;
    while( norm(grad(x,n,l,r)) > tol )
        p = - hess(x,n,l,r) \ (grad(x,n,l,r))';
        t = backtrackLineSearch(f, (grad(x,n,l,r))', p, x, n, l, r);
        x = x + t*p;
        hist = [hist x];
    end
end

    Main Function

clear all; clc
r = 1; l = 0.75; n = 20;
x = ones(20,1);

[ys, hist] = newtonBT(@f2, @gradf2, @hessf2, x, n, l, r);
nsteps = size(hist,2);
for k = 1:nsteps
    fhist(k) = f2(hist(:,k), n, l, r);
    e(k) = fhist(k) - f2(ys, n, l, r);
end
figure;
semilogy(1:nsteps, abs(e))
xlabel('iteration count'); ylabel('|f^k - f^*|');


Problem 3: Gauss-Newton
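The fitted model, as implemented in the routines below, is y(t) \approx a_1\sin(2\pi a_2 t) + a_3\sin(2\pi a_4 t), with residuals and objective

r_i(a) = a_1\sin(2\pi a_2 t_i) + a_3\sin(2\pi a_4 t_i) - y_i, \qquad f(a) = \tfrac{1}{2}\, r(a)^T r(a).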

    Objective function routine

function [f, r] = a2_p3_fun(a, y, t)
    r = a(1).*sin(2.*pi.*a(2).*t) + a(3).*sin(2.*pi.*a(4).*t) - y;
    f = (r'*r)/2;
end

    Gradient routine

function [g, J] = a2_p3_grad(a, t, r)
    j1 = sin(2.*pi.*a(2).*t);
    j2 = a(1).*2.*pi.*t.*cos(2.*pi.*a(2).*t);
    j3 = sin(2.*pi.*a(4).*t);
    j4 = a(3).*2.*pi.*t.*cos(2.*pi.*a(4).*t);
    J = [j1, j2, j3, j4];
    g = J'*r;
end

    Gauss-Newton approximation of the Hessian routine

    Implemented in the code as: (J'*J)
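That is, the full Hessian \nabla^2 f(a) = J^T J + \sum_i r_i \nabla^2 r_i(a) is approximated by dropping the second-order residual term, and the Gauss-Newton direction p is obtained from

(J^T J)\, p = -J^T r = -g.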

    Gauss-Newton method for fitting

tolerance = 0.1;
a = [.9; 1.1; 1.1; .8];
[f, r] = a2_p3_fun(a, y, t);
[g, J] = a2_p3_grad(a, t, r);
iterations = 0;
while(tolerance
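A minimal sketch of how the truncated loop above can be completed, consistent with the routines defined earlier; the stopping test and the plain (undamped) Gauss-Newton update are assumptions, not the original listing.

% Sketch: iterate until the gradient norm falls below the tolerance (y, t are the given data).
while( tolerance < norm(g) )
    p = -(J'*J) \ g;                 % Gauss-Newton direction, (J'*J) p = -g
    a = a + p;                       % full step (no line search)
    [f, r] = a2_p3_fun(a, y, t);     % re-evaluate objective and residuals
    [g, J] = a2_p3_grad(a, t, r);    % re-evaluate gradient and Jacobian
    iterations = iterations + 1;
end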


Number of iterations until the gradient's norm is less than 10e-5: 6.

Note that the starting point was modified to ensure convergence to the desired values of a. If the starting point is chosen far from the desired minimum, the method converges to the nearest local minimum, which is not necessarily the desired one. Even with the modified values, the method did not converge to the specified minimum: it converged to a minimum with the same values of a but at different positions in the parameter vector. To recover exactly the same values and positions of the parameters a, the starting point would have to be chosen even closer to the specified minimum.

Problem 4: Quasi-Newton methods

    Insight into the formula
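For reference, the rank-two update discussed here is presumably the standard BFGS update of the Hessian approximation,

B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}, \qquad s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k).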

B_{k+1} is an approximation of the Hessian matrix of f(x) at iteration k+1. This approximation is updated at every iteration by adding two rank-one matrices: the second and third terms on the right-hand side of the update are symmetric rank-one matrices built from different vectors, so together they form a rank-two correction. The derivation of the formula is based on requiring the secant condition

B_{k+1} s_k = y_k

to hold, that is, on matching the observed change in the gradient y_k over the step s_k at every iteration.


    Computational savings. Description of how the formula is obtained

This formula and the one before it can be obtained from each other using the low-rank (Sherman-Morrison-)Woodbury update. The computational savings come from avoiding both the computation of the Hessian and the solution of the linear system that gives the Newton direction; solving that system costs on the order of N^3 operations per iteration.
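A minimal statement of the relationship, with the usual notation H_k \approx B_k^{-1} (this is the standard inverse-form BFGS update obtained through the Woodbury identity):

H_{k+1} = \left(I - \rho_k s_k y_k^T\right) H_k \left(I - \rho_k y_k s_k^T\right) + \rho_k s_k s_k^T, \qquad \rho_k = \frac{1}{y_k^T s_k},

so the quasi-Newton direction p_k = -H_k \nabla f(x_k) is obtained with a matrix-vector product (order N^2) instead of a linear solve (order N^3).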

    BFGS with a backtracking line search

%% Problem 4 BFGS, MAIN
clear; clc;
[X1,X2] = meshgrid(-2:.1:2, -1:.1:2);
Y = 10.*(X2 - X1.^2).^2 + (1 - X1).^2;

tolerance = 10e-5;
% Initial values
x = [-1.2; 1];
B = eye(2);
stepHist = x;
iterations = 0;
% Loop
while(tolerance
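A possible way the truncated loop above could be completed, using the inverse-Hessian form of the BFGS update together with a backtracking line search; rosenFun is referenced by the performance-assessment code further below, while rosenGrad and the line-search constants are assumptions, so treat this as a sketch rather than the original listing.

% Sketch of the BFGS loop (illustrative completion; B is used here as the INVERSE-Hessian approximation).
rosenFun  = @(x) 10*(x(2) - x(1)^2)^2 + (1 - x(1))^2;                           % same surface as Y above
rosenGrad = @(x) [-40*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); 20*(x(2) - x(1)^2)]; % assumed gradient routine

g  = rosenGrad(x);
fk = rosenFun(x);                        % history of objective values
while( tolerance < norm(g) )
    p = -B*g;                            % quasi-Newton direction
    % backtracking (Armijo) line search
    t = 1; c1 = 0.1; beta = 0.8;
    while( rosenFun(x + t*p) > rosenFun(x) + c1*t*g'*p )
        t = beta*t;
    end
    s  = t*p;                            % step taken
    xn = x + s;
    gn = rosenGrad(xn);
    y  = gn - g;                         % change in the gradient
    rho = 1/(y'*s);
    B = (eye(2) - rho*(s*y'))*B*(eye(2) - rho*(y*s')) + rho*(s*s');   % BFGS update (inverse form)
    x = xn;  g = gn;
    stepHist = [stepHist x];
    fk = [fk, rosenFun(x)];
    iterations = iterations + 1;
end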


iterations                               % display the iteration count

% Performance assessment
figure('Name', 'Quasi-Newton method performance');
semilogy(abs(fk(:, (length(fk(1,:))-3):length(fk(1,:))) - rosenFun([1;1])));
xlabel('Number of iterations');
ylabel('Error = |fk - f*|');
grid on

    Number of iterations until the norm of the error is less than 10e-5: 23

Comparison with Newton's method and steepest descent

Iterations of each method until the same convergence:

    Method                  Iterations
    Steepest descent        1311
    Newton's method         12
    BFGS (Quasi-Newton)     23


Rate of convergence over the last three iterations

This result makes sense because the method is ultimately based on Newton's method: as the approximation of the Hessian (of its inverse, in this case) improves, the rate of convergence approaches the superlinear rate expected of quasi-Newton methods. Towards the end, the method converged about as fast as Newton's method; the drawback shows in the first iterations, before the approximation becomes accurate.