-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new lm minimiser #208
Add new lm minimiser #208
Conversation
2 tests omitted for lm4 - seems to be issue if parameter exactly at bound
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #208 +/- ##
==========================================
+ Coverage 42.86% 43.29% +0.42%
==========================================
Files 242 243 +1
Lines 16240 16383 +143
==========================================
+ Hits 6962 7093 +131
- Misses 9278 9290 +12 ☔ View full report in Codecov by Sentry. |
Test Results 4 files ± 0 124 suites ±0 4m 47s ⏱️ -2s Results for commit 1fc202a. ± Comparison against base commit 73f604a. This pull request removes 2 and adds 42 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
In ndbase.lm4 and simplex optimisers. Also added tests This allows support for ND fitting problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. It should probably be two different PR, one for the residual return function and one for lm4
but I think it's fine since the residual stuff is small; but I'll split my comments on the two parts:
- The residual stuff is good. I don't have a problem with using the
resid_handle
flag. The only other way I can think of to check it is to evaluate the model function and check if it returns a scalar or not but this could be costly just on construction to test what kind of input it is. Forlm4
where we evaluate to get the starting cost value then this might be a way but I'm not sure if thesimplex
minimiser does this and I don't know how to implement it since the wrapper is separate from the mimisers... - The docstring for
lm4
needs a lot of updating. I guess it was taken from another function? - Some of the unit tests (especially for bounded problems) are disabled for
lm4
but work forsimplex
- is this a weakness oflm4
? Should we note in the docstring that bounded fitting withlm4
is a bit flaky and maybe suggest users to specify larger bounds?
end | ||
|
||
function test_init_with_fcost_all_params_fixed(testCase) | ||
% note second param outside bounds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is wrong? It doesn't look like you've set any bounds here, just fixed the parameter values?
Looks like this was copied from test_init_with_fcost_no_bounds_with_fixed_param_using_ifix
which also has the same (erroneous?) comment in line 96.
testCase.verify_val(cost_val, 0, 'abs_tol', 1e-6); | ||
end | ||
|
||
function test_optimise_residual_array_lm(testCase, optimiser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should also run for ndbase.simplex
too, right? Should the _lm
suffix be removed?
testCase.verify_val(pars_fit, testCase.rosenbrock_minimum, 'abs_tol', 1e-3); | ||
testCase.verify_val(cost_val, 0, 'abs_tol', 1e-6); | ||
end | ||
|
||
function test_optimise_rosen_both_bounds_minimum_not_accessible(testCase, optimiser) | ||
function test_optimise_rosen_both_bounds_minimum_not_accessible(testCase) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in this test the bounds given are all outside the optimum parameter values of (1, 1)
- in this case would we expect all minimisers to try to find the optimum value within the bounds? If so, do you know why lm4
isn't able to do this? Is it something to do with how we've implemented the bounds? (Is it because the gradient becomes too steep near the boundaries?)
(Just running it manually with these bounds and the Rosenbrock function I get optimised parameters (-0.5, -0.5)
which is just the lower bound... likewise with the next test (fixed_minimum_not_accessible
) where lm4
just returns the lower bound.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checked and lsqnonlin
does find the lowest cost val here at (0,0)
I think the problem could be solved by moving slightly away from the boundary edge when we reset invalid parameters outside bounds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh if you change to have the initial guess 0.1
inside the lb
like so
'lb', [-1.1, -1.1]
This test passes for lm4
minimiser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had a play and it still seems flaky. For some bounds it works and for others it doesn't... I wonder if we have to transform the diff_step
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the diff step has to be taken in the transformed coordinate so as not to go over the bounds - the diff step is relative - i.e. scaled by the magnitude of the scaled/free parameter rather than the bound one which I also think it correct. We can discuss tomorrow!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with implementing the box start parameters in this gist I could get all the tests to pass with for both simplex
and lm4
, even with the original bounds in both_bounds_minimum_accessible
and with a tighter tolerance for upper_bound_minimum_not_accessible
(1e-4
compared to 1e-6
for simplex
), but I had to change the default nu_up
from 10
to 5
in order to get those two tests to fit with the original bounds and tighter tolerance... not sure if this is ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mducle - I'll look at this today - in theory I don't have a problem with changing nu_up=5
as long as it still passes the NIST data sets.
Here is the advice given on wikipedia...
An effective strategy for the control of the damping parameter, called delayed gratification, consists of increasing the parameter by a small amount for each uphill step, and decreasing by a large amount for each downhill step. The idea behind this strategy is to avoid moving downhill too fast in the beginning of optimization, therefore restricting the steps available in future iterations and therefore slowing down convergence.[7] An increase by a factor of 2 and a decrease by a factor of 3 has been shown to be effective in most cases, while for large problems more extreme values can work better, with an increase by a factor of 1.5 and a decrease by a factor of 5.[8]
Which I think (if I'm interpreting it correctly) would correspond to nu_up=2, nu_dn=0.33
(similar value to existing default value for nu_dn
). I will try nu_up = 10 (existing default), 5, 2
and see how they compare in terms of the number of iterations as well as accuracy. FYI I got my value of nu_up
from numerical recipes (or something based on numerical recipes...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I quickly tried this on the Gauss3
NIST dataset (so-called average difficulty with 8 parameters) and get these results (note a negative dcost
indicates the chi-squared value we found is larger than the NIST provided value - but as you can see only by a very small amount, this could probably be reduced by changing the default convergence criteria!)
% minimising residual array
% nu_up nu_dn iterations nhess_evals dcost(%)
% 10 0.3 11 10 -1.1e-9
% 5 0.3 9 8 -1.1e-9
% 2 0.3 11 8 -1.1e-9
% 1.5 0.2 15 8 -1.1e-9
% minimising scalar
% nu_up nu_dn iterations nhess_evals dcost(%)
% 10 0.3 12 10 -1.2e-7
% 5 0.3 17 8 -6.5e-8
% 2 0.3 16 10 -1.2e-7
% 1.5 0.2 21 9 -1.2e-7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried fitting the Eckerle4
NIST dataset (high difficulty - 3 parameters) - note lm3
can't fit this
Here are the results
% minimising residual array
% nu_up nu_dn iterations nhess_evals dcost(%)
% 10 0.3 121 81 -5.7e-5
% 5 0.3 64 39 -1.6e-5
% 2 0.3 136 54 -2.3e-7
% 1.5 0.2 166 38 -3.1e-5
% minimising scalar
% nu_up nu_dn iterations nhess_evals dcost(%)
% 10 0.3 55 38 -0.7
% 5 0.3 47 29 -0.7
% 2 0.3 76 17 -2e4 % very bad fit!
% 1.5 0.2 74 19 -0.7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like nu_up=5, nu_dn=0.3
is a bit of a sweet spot! Albeit having not given it the rigorous fitbenchmarking treatment...
stat.func = cost_func_wrap.cost_func; | ||
stat.param = param; | ||
stat.param.Np = cost_func_wrap.get_num_free_parameters(); | ||
stat.msg = message; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the docstring this is message
- it might be good to change it here to match the docstring...
Thanks for the review @mducle - sorry for the long PR, it probably should have been 2 as you say! I've been on unscheduled leave most of the last two days so as didn't get a chance to give final polish - in particular I didn't get a chance to change docstring (taken from
Yes there are 2 tests disabled for the I will add a warning when the parameter is reset at the boundary (as I also wondered whether a different parameter transformation may be better? But it is worth noting that it is enabled for the majority of tests with bound parameters! |
Since the last review:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a couple of typos in the docstring.
New features:
lm4
that support minimising scalars correctly and with optional bounds on parameterssw_fitpowder
(powder fitting class)vary
tosimplex
to specify which parameters to vary in the fit (provided this option inlm4
as well)Changes to behaviour:
cost_function_wrapper
(and minimisers which use it - e.g.simplex
andlm4
) will reset parameters out of bounds provided using same method aslsqnonlin
(previously would reset to be nearest boundary, but this didn't work well for the LM method).cost_function_wrapper
will now throw a warning when parameters have been reset inside the bounds.