Linear and Non Linear Optimization HW2


Muhammad Junaid Farooq, Lorenzo Ernesto Ochoa Diaz
February 20th, 2014

    Assignment 2

Problem 1: Problem in R^100

The objective is

f(x) = -\sum_{i=1}^{500} \ln\!\left(1 - a_i^T x\right) - \sum_{j=1}^{100} \ln\!\left(1 - x_j^2\right), \qquad x \in \mathbb{R}^{100},

where a_i denotes the i-th column of A \in \mathbb{R}^{100\times 500}.

Gradient Expression:

\frac{\partial f}{\partial x_j} = \sum_{i=1}^{500} \frac{a_{ij}}{1 - a_i^T x} + \frac{2 x_j}{1 - x_j^2}, \qquad j = 1,\dots,100.

Hessian Expression:

\frac{\partial^2 f}{\partial x_j \, \partial x_k} = \sum_{i=1}^{500} \frac{a_{ij}\, a_{ik}}{\left(1 - a_i^T x\right)^2}, \qquad j \neq k,

\frac{\partial^2 f}{\partial x_j^2} = \sum_{i=1}^{500} \frac{a_{ij}^2}{\left(1 - a_i^T x\right)^2} + \frac{2 + 2 x_j^2}{\left(1 - x_j^2\right)^2}.

    Routine for Objective function

function f = f1(A,x)
    for i = 1:500
        m(i) = (A(:,i))'*x;
    end
    u = log(1 - m);          % natural log, consistent with the gradient and Hessian above
    v = log(1 - x.^2);
    f = -1*sum(u) - 1*sum(v);
end

    Routine for gradient computation
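A minimal sketch of the gradient routine, consistent with the gradient expression above and with the calling convention gradf1(A,x) used by the other routines; treat it as a reconstruction rather than the original listing.

% Sketch of gradf1 (reconstruction), returning a 100x1 column vector.
function g = gradf1(A,x)
    g = zeros(100,1);
    for i = 1:500
        m(i) = 1 - (A(:,i))'*x;                    % margins 1 - a_i'*x
    end
    for j = 1:100
        g(j) = sum(A(j,:)./m) + 2*x(j)/(1 - (x(j))^2);
    end
end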


    Routine for Hessian computation

function g = hessf1(A,x)
    g = zeros(100,100);
    for i = 1:500
        m(i) = (1 - (A(:,i))'*x).^2;     % squared margins (1 - a_i'*x)^2
    end
    % diagonal entries
    for j = 1:100
        g(j,j) = sum(((A(j,:)).^2)./m) + (2+2*(x(j)).^2)/((1-(x(j)).^2).^2);
    end
    % off-diagonal entries (symmetric)
    for j = 1:100
        for k = j+1:100
            for i = 1:500
                u(i) = A(j,i)*A(k,i);
            end
            g(j,k) = sum(u./m);
            g(k,j) = g(j,k);
        end
    end
end

Newton Backtracking Routines

function t = backtrackLineSearch(f, gk, pk, xk, A)
    a = 0.1; b = 0.8;
    t = 1;
    % first shrink t until the trial point is strictly feasible
    % (all arguments of the log barriers positive)
    while( cond1(A, xk + t*pk) <= 0 || cond2(A, xk + t*pk) <= 0 )
        t = b * t;
    end
    % standard backtracking on the sufficient-decrease (Armijo) condition
    while( f(A, xk + t*pk) > f(A, xk) + a*t*gk'*pk )
        t = b * t;
    end
end

function [x, hist] = newtonBT(f, grad, hess, A, x0)
    x = x0; hist = x0; tol = 1e-5;
    while( norm(grad(A,x)) > tol )
        p = - hess(A,x) \ grad(A,x);                      % Newton direction
        t = backtrackLineSearch(f, grad(A,x), p, x, A);
        x = x + t*p;
        hist = [hist x];
    end
end

function g = cond1(A,x)
    for i = 1:500
        m(i) = (A(:,i))'*x;
    end
    g = min(1 - m);          % smallest margin of the constraints 1 - a_i'*x > 0
end

function g = cond2(A,x)
    for i = 1:100


        m(i) = 1 - (x(i)).^2;
    end
    g = min(m);              % smallest margin of the constraints 1 - x_j^2 > 0
end

    Main Routine

close all; clear all;
A = 0.5 + 0.5*randn([100,500]);
x = 0.1*ones(100,1);
f = f1(A,x);
G = gradf1(A,x);
H = hessf1(A,x);

[ys, his] = newtonBT(@f1, @gradf1, @hessf1, A, x);

nsteps = size(his,2);
for k = 1:nsteps
    fhist(k) = f1(A, his(:,k));
    e(k) = fhist(k) - f1(A, ys);
end

figure;
semilogy(1:nsteps, abs(e))
xlabel('iteration count'); ylabel('|f^k - f^*|');

    Convergence Behavior

[Figure: semilog plot of |f^k - f^*| versus iteration count (x-axis 1 to 10, y-axis roughly 10^-10 to 10^4).]


    Problem 2: Minimal Surface
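The discretized objective implemented in f2 below is the lateral surface area of the piecewise-linear surface of revolution; with h = l/(n+1) and both boundary radii equal to r, it reads (as reconstructed from the routine)

f(x) = 2\pi h\left[\, x_1\sqrt{1+\left(\tfrac{x_1-r}{h}\right)^{2}} + \sum_{i=1}^{n-1} x_i\sqrt{1+\left(\tfrac{x_{i+1}-x_i}{h}\right)^{2}} + x_n\sqrt{1+\left(\tfrac{r-x_n}{h}\right)^{2}}\,\right].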

    Routine for Objective function

function g = f2(x,n,l,r)
    h = l/(n+1);
    a = 2*pi*h*x(1)*sqrt( 1 + (( x(1) - r )/h)^2 );
    b = 0;
    for i = 1:n-1
        u = 2*pi*h*x(i)*sqrt( 1 + (( x(i+1) - x(i) )/h)^2 );
        b = b + u;
    end
    c = 2*pi*h*x(n)*sqrt( 1 + (( r - x(n) )/h)^2 );
    g = a + b + c;
end

    Routine for gradient computation

function g = gradf2(x,n,l,r)
    h = l/(n+1);
    a = (x(1) - r)/h;    b = (x(2) - x(1))/h;
    c = 1 + a^2;         d = 1 + b^2;
    u = (r - x(n))/h;    v = (x(n) - x(n-1))/h;
    w = 1 + u^2;         z = 1 + v^2;
    g(1) = 2*pi*( x(1)*a*c^(-0.5) + h*sqrt(c) + h*sqrt(d) - x(1)*b*d^(-0.5) );
    g(n) = 2*pi*( -1*x(n)*u*w^(-0.5) + h*w^0.5 + x(n-1)*v*z^(-0.5) );
    for m = 2:n-1
        p = (x(m) - x(m-1))/h;   q = 1 + p^2;
        s = (x(m+1) - x(m))/h;   t = 1 + s^2;
        g(m) = 2*pi*( x(m-1)*p*q^(-0.5) + h*sqrt(t) - x(m)*s*t^(-0.5) );
    end
end

Routine for Hessian computation

function g = hessf2(x,n,l,r)
    h = l/(n+1);
    %g = zeros(n,n);
    a = (x(1) - r)/h;    b = (x(2) - x(1))/h;
    c = 1 + a^2;         d = 1 + b^2;
    u = (r - x(n))/h;    v = (x(n) - x(n-1))/h;
    w = 1 + u^2;         z = 1 + v^2;


    g(1,1)   = 2*pi*( -1*x(1)/h*a^2*c^-1.5 + ((2*x(1)-r)/h)*c^-0.5 + a*c^-0.5 - x(1)/h*b^2*d^-1.5 - b*d^-0.5 - ((x(2)-2*x(1))/h)*d^-0.5 );
    g(1,2)   = 2*pi*( x(1)/h*b^2*d^-1.5 - x(1)/h*d^-1.5 + b*d^-0.5 );
    g(n,n-1) = 2*pi*( x(n-1)/h*v^2*z^-1.5 + ((x(n)-2*x(n-1))/h)*z^-0.5 );
    g(n,n)   = 2*pi*( -1*x(n-1)/h*v^2*z^-1.5 - ((r-2*x(n))/h)*w^-0.5 - u*w^-0.5 - x(n)/h*u^2*w^-1.5 + x(n-1)/h*z^-0.5 );

    for m = 2:n-1
        p = (x(m) - x(m-1))/h;   q = 1 + p^2;
        s = (x(m+1) - x(m))/h;   t = 1 + s^2;
        g(m,m)   = 2*pi*( -1*x(m-1)/h*p^2*q^(-1.5) + x(m-1)/h*q^(-0.5) - x(m)/h*s^2*t^(-1.5) - ((x(m+1)-2*x(m))/h)*t^-0.5 + s*t^-1.5 );
        g(m,m-1) = 2*pi*( x(m-1)/h*p^2*q^-1.5 + ((x(m)-2*x(m-1))/h)*q^-0.5 );
        g(m,m+1) = 2*pi*( x(m)/h*s^2*t^-1.5 - x(m)/h*t^-0.5 + s*t^-0.5 );
    end
end

Newton Backtracking Routines

function t = backtrackLineSearch(f, gk, pk, xk, n, l, r)
    a = 0.1; b = 0.8;
    t = 1;
    while( f(xk + t*pk, n, l, r) > f(xk, n, l, r) + a*t*gk'*pk )
        t = b * t;
    end
end

function [x, hist] = newtonBT(f, grad, hess, x0, n, l, r)
    x = x0; hist = x0; tol = 1e-5;
    while( norm(grad(x,n,l,r)) > tol )
        p = - hess(x,n,l,r) \ (grad(x,n,l,r))';
        t = backtrackLineSearch(f, (grad(x,n,l,r))', p, x, n, l, r);
        x = x + t*p;
        hist = [hist x];
    end
end

    Main Function

clear all; clc
r = 1; l = 0.75; n = 20;
x = ones(20,1);

[ys, hist] = newtonBT(@f2, @gradf2, @hessf2, x, n, l, r);
nsteps = size(hist,2);
for k = 1:nsteps
    fhist(k) = f2(hist(:,k), n, l, r);
    e(k) = fhist(k) - f2(ys, n, l, r);
end
figure;
semilogy(1:nsteps, abs(e))
xlabel('iteration count'); ylabel('|f^k - f^*|');


Problem 3: Gauss-Newton
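The fitted model, as implemented in the routines below, is y(t) \approx a_1\sin(2\pi a_2 t) + a_3\sin(2\pi a_4 t), with residuals and objective

r_i(a) = a_1\sin(2\pi a_2 t_i) + a_3\sin(2\pi a_4 t_i) - y_i, \qquad f(a) = \tfrac{1}{2}\, r(a)^T r(a).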

    Objective function routine

function [f, r] = a2_p3_fun(a, y, t)
    r = a(1).*sin(2.*pi.*a(2).*t) + a(3).*sin(2.*pi.*a(4).*t) - y;
    f = (r'*r)/2;
end

    Gradient routine

function [g, J] = a2_p3_grad(a, t, r)
    j1 = sin(2.*pi.*a(2).*t);
    j2 = a(1).*2.*pi.*t.*cos(2.*pi.*a(2).*t);
    j3 = sin(2.*pi.*a(4).*t);
    j4 = a(3).*2.*pi.*t.*cos(2.*pi.*a(4).*t);
    J = [j1, j2, j3, j4];
    g = J'*r;
end

    Gauss-Newton approximation of the Hessian routine

    Implemented in the code as: (J'*J)
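That is, the full Hessian \nabla^2 f(a) = J^T J + \sum_i r_i \nabla^2 r_i(a) is approximated by dropping the second-order residual term, and the Gauss-Newton direction p is obtained from

(J^T J)\, p = -J^T r = -g.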

    Gauss-Newton method for fitting

tolerance = 0.1;
a = [.9; 1.1; 1.1; .8];
[f, r] = a2_p3_fun(a, y, t);
[g, J] = a2_p3_grad(a, t, r);
iterations = 0;
while(tolerance
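A minimal sketch of how the truncated loop above can be completed, consistent with the routines defined earlier; the stopping test and the plain (undamped) Gauss-Newton update are assumptions, not the original listing.

% Sketch: iterate until the gradient norm falls below the tolerance (y, t are the given data).
while( tolerance < norm(g) )
    p = -(J'*J) \ g;                 % Gauss-Newton direction, (J'*J) p = -g
    a = a + p;                       % full step (no line search)
    [f, r] = a2_p3_fun(a, y, t);     % re-evaluate objective and residuals
    [g, J] = a2_p3_grad(a, t, r);    % re-evaluate gradient and Jacobian
    iterations = iterations + 1;
end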


Number of iterations until the gradient's norm is less than 10e-5: 6.

Note that the starting point was modified to ensure convergence to the desired values of a. If the starting point is chosen far from the desired minimum, the method converges to the nearest local minimum, which is not necessarily the desired one. Even with the modified values, the method did not converge to the specified minimum: it converged to a minimum with the same values of a but at different positions in the parameter vector. To recover exactly the same values and positions of the parameters a, the starting point would have to be chosen even closer to the specified minimum.

Problem 4: Quasi-Newton methods

    Insight into the formula
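For reference, the rank-two update discussed here is presumably the standard BFGS update of the Hessian approximation,

B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}, \qquad s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k).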

B_{k+1} is an approximation of the Hessian matrix of f(x) at iteration k+1. This approximation is updated at every iteration by adding two rank-one matrices: the second and third terms on the right-hand side of the update are symmetric rank-one matrices built from different vectors, so together they form a rank-two correction. The derivation of the formula is based on requiring the secant condition

B_{k+1} s_k = y_k

to hold, that is, on matching the observed change in the gradient y_k over the step s_k at every iteration.


    Computational savings. Description of how the formula is obtained

This formula and the one before it can be obtained from each other using the low-rank (Sherman-Morrison-)Woodbury update. The computational savings come from avoiding both the computation of the Hessian and the solution of the linear system that gives the Newton direction; solving that system costs on the order of N^3 operations per iteration.
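A minimal statement of the relationship, with the usual notation H_k \approx B_k^{-1} (this is the standard inverse-form BFGS update obtained through the Woodbury identity):

H_{k+1} = \left(I - \rho_k s_k y_k^T\right) H_k \left(I - \rho_k y_k s_k^T\right) + \rho_k s_k s_k^T, \qquad \rho_k = \frac{1}{y_k^T s_k},

so the quasi-Newton direction p_k = -H_k \nabla f(x_k) is obtained with a matrix-vector product (order N^2) instead of a linear solve (order N^3).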

    BFGS with a backtracking line search

%% Problem 4 BFGS, MAIN
clear; clc;
[X1,X2] = meshgrid(-2:.1:2, -1:.1:2);
Y = 10.*(X2 - X1.^2).^2 + (1 - X1).^2;

tolerance = 10e-5;
% Initial values
x = [-1.2; 1];
B = eye(2);
stepHist = x;
iterations = 0;
% Loop
while(tolerance
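A possible way the truncated loop above could be completed, using the inverse-Hessian form of the BFGS update together with a backtracking line search; rosenFun is referenced by the performance-assessment code further below, while rosenGrad and the line-search constants are assumptions, so treat this as a sketch rather than the original listing.

% Sketch of the BFGS loop (illustrative completion; B is used here as the INVERSE-Hessian approximation).
rosenFun  = @(x) 10*(x(2) - x(1)^2)^2 + (1 - x(1))^2;                           % same surface as Y above
rosenGrad = @(x) [-40*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); 20*(x(2) - x(1)^2)]; % assumed gradient routine

g  = rosenGrad(x);
fk = rosenFun(x);                        % history of objective values
while( tolerance < norm(g) )
    p = -B*g;                            % quasi-Newton direction
    % backtracking (Armijo) line search
    t = 1; c1 = 0.1; beta = 0.8;
    while( rosenFun(x + t*p) > rosenFun(x) + c1*t*g'*p )
        t = beta*t;
    end
    s  = t*p;                            % step taken
    xn = x + s;
    gn = rosenGrad(xn);
    y  = gn - g;                         % change in the gradient
    rho = 1/(y'*s);
    B = (eye(2) - rho*(s*y'))*B*(eye(2) - rho*(y*s')) + rho*(s*s');   % BFGS update (inverse form)
    x = xn;  g = gn;
    stepHist = [stepHist x];
    fk = [fk, rosenFun(x)];
    iterations = iterations + 1;
end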


iterations                               % display the iteration count

% Performance assessment
figure('Name', 'Quasi-Newton method performance');
semilogy(abs(fk(:, (length(fk(1,:))-3):length(fk(1,:))) - rosenFun([1;1])));
xlabel('Number of iterations');
ylabel('Error = |fk - f*|');
grid on

    Number of iterations until the norm of the error is less than 10e-5: 23

Comparison with Newton's method and steepest descent

Iterations of each method until the same convergence:

    Method                  Iterations
    Steepest descent        1311
    Newton's method         12
    BFGS (Quasi-Newton)     23


Rate of convergence over the last three iterations

This result makes sense because the method is ultimately based on Newton's method: as the approximation of the Hessian (of its inverse, in this case) improves, the rate of convergence approaches the superlinear rate expected of quasi-Newton methods. Towards the end, the method converged about as fast as Newton's method; the drawback shows in the first iterations, before the approximation becomes accurate.