# Solved – How to compute the gradient for logistic regression in Matlab

I'm trying to minimize the function f. At first I used fminsearch, but it takes a long time, so I switched to fminunc. There is one problem: I need the gradient of the function to speed it up.

```matlab
f = @(w) sum(log(1 + exp(-t .* (phis * w'))))/size(phis, 1) + coef * w*w';
options = optimset('Display', 'notify', 'MaxFunEvals', 2e+6, 'MaxIter', 2e+6);
w = fminunc(f, ones(1, size(phis, 2)), options);
```
• phis size is N x (N+1)
• t size is Nx1
• coef is const

Could you please help me construct the gradient of f? I always get this warning:

```
Warning: Gradient must be provided for trust-region algorithm;
  using line-search algorithm instead.
```

By the chain rule, the gradient should be:

```matlab
%the gradient
%helper function
expt = @(w)(exp(-t .* (phis * w')));
%precompute -t * phis (row i of phis scaled by -t(i))
tphis = -diag(t) * phis;  %or bsxfun(@times,-t,phis);
%the gradient
gradf = @(w)((sum(bsxfun(@times,expt(w) ./ (1 + expt(w)), tphis),1)'/size(phis,1)) + 2*coef * w');
```
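For reference, the chain-rule step can be written out explicitly. The notation here is an assumption and not from the original post: $\phi_i$ is the $i$-th row of `phis`, $t_i$ is the $i$-th entry of `t`, $c$ is `coef`, $N$ is `size(phis, 1)`, and $w$ is a row vector as in the question.

$$
f(w) = \frac{1}{N}\sum_{i=1}^{N} \log\bigl(1 + e^{-t_i \phi_i w^{\top}}\bigr) + c\, w w^{\top}
$$

$$
\nabla_w f(w) = \frac{1}{N}\sum_{i=1}^{N} \frac{e^{-t_i \phi_i w^{\top}}}{1 + e^{-t_i \phi_i w^{\top}}}\,\bigl(-t_i \phi_i\bigr) + 2c\,w
$$

The fraction corresponds to `expt(w) ./ (1 + expt(w))`, the factor $-t_i \phi_i$ to row $i$ of `tphis`, and the last term to `2*coef * w'`. Written this way the gradient is a row vector like $w$; the code returns its transpose.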

It would probably be faster not to compute `expt(w)` twice per evaluation; you can rewrite this in terms of another anonymous function that takes `exptw` as input.

Also, I may have goofed up the dimensions on the sum. It seems you are using `w` as a row vector, which is somewhat nonstandard.
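One way to act on both remarks is to put the objective and the gradient in a single function, so the exponential is evaluated once and the gradient has the same shape as `w`. This is only a sketch: the function name `logregObjective` and its argument order are made up for illustration, and it assumes `w` is a 1-by-(N+1) row vector as in the question.

```matlab
function [f, g] = logregObjective(w, phis, t, coef)
% Hypothetical helper (not from the original post).
% Returns the objective f and, only when requested, its gradient g.
% Assumes w is 1-by-(N+1), phis is N-by-(N+1), t is N-by-1, coef is a scalar.
    n = size(phis, 1);
    e = exp(-t .* (phis * w'));               % N-by-1, computed once
    f = sum(log(1 + e)) / n + coef * (w * w');
    if nargout > 1
        s = e ./ (1 + e);                     % chain-rule weights, N-by-1
        g = -((t .* s)' * phis) / n + 2 * coef * w;   % same shape as w
    end
end
```

Saved as `logregObjective.m`, it could then be passed to `fminunc` with the legacy `'GradObj'` option switched on, which tells the solver that the objective also returns a gradient (newer releases expose the same setting through `optimoptions` as `'SpecifyObjectiveGradient'`):

```matlab
options = optimset('GradObj', 'on', 'Display', 'notify');
w = fminunc(@(w) logregObjective(w, phis, t, coef), ones(1, size(phis, 2)), options);
```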

Edit: as @whuber noted, this kind of thing is easy to screw up. I hadn't actually run the code I posted previously. The above should be correct now. To test it, I estimated the gradient numerically and compared it to the 'exact' value, as below:

```matlab
%set up the problem
N = 9;
phis = rand(N,N+1);
t = rand(N,1);
coef = rand(1);

%the objective
f = @(w)((sum(log(1 + exp(-t .* (phis * w'))),1) / size(phis, 1)) + coef * w*w');

%helper function
expt = @(w)(exp(-t .* (phis * w')));
%precompute -t * phis (row i of phis scaled by -t(i))
tphis = -diag(t) * phis;  %or bsxfun(@times,-t,phis);
%the gradient
gradf = @(w)((sum(bsxfun(@times,expt(w) ./ (1 + expt(w)), tphis),1)'/size(phis,1)) + 2*coef * w');

%test the code now:
%compute the approximate gradient numerically
w0 = randn(1,N+1);
fw = f(w0);

%%the numerical:
delta = 1e-6;
eyeN = eye(N+1);

gfw = nan(size(w0));
for iii=1:numel(w0)
    gfw(iii) = (f(w0 + delta * eyeN(iii,:)) - fw) ./ delta;
end

%the 'exact':
truegfw = gradf(w0);

%report
fprintf('max difference between exact and numerical is %g\n',max(abs(truegfw' - gfw)));
```

When I run this (sorry, I should have set the rand seed), I get:

`max difference between exact and numerical is 4.80006e-07`

YMMV, depending on the rand seed and the value of `delta` used.
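To see how the finite-difference scheme affects the comparison, the sketch below runs the same kind of check with a fixed seed and compares a one-sided (forward) difference with a central difference, which generally has a smaller truncation error for the same `delta`. It is an illustration rather than the original test: the seed value, the `sig` helper, and the row-vector form of `gradf` are my own choices.

```matlab
% Self-contained gradient check with a fixed seed (the seed value is arbitrary).
rng(0);
N = 9;
phis = rand(N, N+1);
t = rand(N, 1);
coef = rand(1);

% Objective as in the post; w is a 1-by-(N+1) row vector.
f = @(w) sum(log(1 + exp(-t .* (phis * w')))) / N + coef * (w * w');

% sig(w) equals e ./ (1 + e) with e = exp(-t .* (phis * w')),
% rewritten so the exponential is evaluated only once.
sig = @(w) 1 ./ (1 + exp(t .* (phis * w')));
gradf = @(w) -((t .* sig(w))' * phis) / N + 2 * coef * w;   % row vector

w0 = randn(1, N+1);
delta = 1e-6;
eyeN = eye(N+1);
gfwd = nan(size(w0));
gctr = nan(size(w0));
for k = 1:numel(w0)
    step = delta * eyeN(k, :);
    gfwd(k) = (f(w0 + step) - f(w0)) / delta;              % forward difference
    gctr(k) = (f(w0 + step) - f(w0 - step)) / (2 * delta); % central difference
end

fprintf('max difference, forward: %g\n', max(abs(gradf(w0) - gfwd)));
fprintf('max difference, central: %g\n', max(abs(gradf(w0) - gctr)));
```

The central difference costs one extra function evaluation per coordinate, which is usually acceptable for a one-off correctness check.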
