passing a keyword parameter to model.fit() ignores added parameters #828
Replies: 7 comments 2 replies
-
@rolfverberg If I understand correctly, I would call that "intentional". I think that if you had better initial values for the Parameters, you would get a better fit ;).
-
What works is the following: setting the new value directly on the Parameters object before calling fit, rather than passing it as a keyword argument. Now I get the desired output; the [[Variables]] section shows init = 1.1 for amp_ratio.
-
@rolfverberg Again, if I understand correctly, I would call all of that "intentional". For sure, you have to set an initial value for each parameter; there is never a case in which a default value should be used. If you are not setting an initial value for every parameter, you are going to get bad fits. If you pass in a Parameters object and also pass in a value for a particular parameter to Model.fit(), which one should take precedence? We chose "ignore the argument, use the value in the Parameters".
-
Um, what? I do not know what you mean by "map of curves", so maybe I'm not understanding you. If you want to state that you cannot run fits in a loop, I will not disagree with you. But lots of people do exactly that, and I do not understand why you cannot.
Do you mean something like calling `Model.fit()` in a loop over your datasets?
A model has a `param_names` attribute built from the arguments of the model function; a model comprised of two Gaussians will have 6 elements in `param_names`. But, of course, there will be additional parameters as well, such as derived parameters (`fwhm`, `height`) or any parameters you add yourself.
-
I guess I should have explained the goal more clearly; thanks for your patience. I'm collecting data in a synchrotron (X-ray) experiment. The data is an energy spectrum, intensity vs X-ray energy, so a simple 1D plot. However, I have an energy spectrum for a set of parameters that I can organize in a 2D map, so one spectrum for each point of the 2D map. Think of something like transmission data vs X-ray energy for a range of x and y positions on a 3D sample, with the X-ray beam in the z-direction. I now want to fit the 1D intensity vs X-ray energy spectra for each x and y position, and there are a lot of them, say 1000x1000.

I first create a model, and it may contain constraints between model parameters of the model components that are added as parameters in addition to the model components. This model is the same and shared for all the individual fits. I then fit all datasets with this model in parallel.

I have a set of "safe" or "robust" initial parameters for all the model parameters (yes, decent initial parameters are crucial): the sum of the model component parameters and any constrained parameters added in addition to the model component parameters. I like that set to remain unchanged from fit to fit, so I can go back to it if a fit fails. I do, however, like to start each new fit with a parameter set that is the solution of the previous fit, if that fit succeeded (if not, I stick with the "safe" initial ones, universal to all fits). Here is an abbreviated, edited version of my loop:

```python
class FitMap():
    def __init__(self, x, ymap, **kwargs):
        self._x = np.asarray(x)
        self._ymap = np.asarray(ymap)
        self._model = None
        self._parameters = Parameters()

    def fit(self, **kwargs):
        map_shape = self._ymap.shape
        ij = [(i, j) for i in range(map_shape[0]) for j in range(map_shape[1])]
        num = min(num_fit_per_proc, num_fit_per_batch)
        with Parallel(n_jobs=num_proc) as parallel:
            parallel(delayed(self._fit_parallel)(ij, num, n_start, **kwargs)
                     for n_start in range(0, len(ij), num))

    def _fit_parallel(self, ij, num, n_start, **kwargs):
        current_best_values = {}
        for n in range(num):
            i, j = ij[n_start+n]
            kkwargs = {**kwargs, **current_best_values}
            # Prevent current best values from sitting at boundaries
            if len(current_best_values):
                for name, value in current_best_values.items():
                    par = self._parameters[name]
                    kkwargs[name] = self._reset_pars_at_boundary(value)
            result = self._model.fit(self._ymap[i, j], self._parameters,
                                     x=self._x, **kkwargs)
            if result.success:
                current_best_values = {par.name: par.value
                                       for par in result.params.values()
                                       if par.vary}
            else:
                current_best_values = {}
            # Collect results
```

I like and appreciate the separation of model and parameters, and I can work around this easily: I can copy the initial parameters that I like to keep unmodified from the start for each fit, make any changes I like, and feed the modified set to Model.fit(). It took me a little by surprise, and sent me digging in the code, that model parameters that are part of self.param_names can be modified by keyword arguments to Model.fit(), whereas any additional parameters that exist only in the Parameters object are ignored. Which is why I didn't phrase it as a bug or error: it's a choice that was made, and one that I can work with. To me the difference is not intuitive, but that doesn't make it right or wrong... I am very pleased with lmfit and grateful for the effort in creating and maintaining the package!
-
On Fri, Nov 11, 2022 at 9:32 AM Rolf Verberg ***@***.***> wrote:
> I guess I should have explained the goal more clearly, thanks for your
> patience. I'm collecting data on a synchrotron (say sort of xray)
> experiment. The data is an energy spectrum, intensity vs xray energy, so a
> simple 1D plot. However, I have an energy spectrum for a set of parameters
> that I can organize in a 2D map, so one spectrum for each point of the 2D
> map. Think of something like transmission data vs xray energy for a range
> of x and y positions on a 3D sample, with the xray beam in the z-direction.
> I now want to fit the 1D intensity vs xray energy spectra for each x and y
> position and there's a lot of them, say 1000x1000.
Yep, I work at a synchrotron beamline (doing X-ray Absorption and
Fluorescence spectroscopies). Lots of people using lmfit are synchrotron
people. In many ways, lmfit is a spin-off from XAS software.
> I first create a model and it may contain constraints between model
> parameters of the model components that are added as parameters in addition
> to the model components. This model is the same and shared for all the
> individual fits.
> I then fit all datasets with this model in parallel.
Yep, that's all fine.
> I have a set of "safe" or "robust" initial parameters for all the model
> parameters (yes, decent initial parameters are crucial), the sum of the
> model component parameters and any constrained parameters added in addition
> to the model component parameters. I like that set to remain unchanged from
> fit to fit, so I can go back to it if a fit fails. I do however like to
> start each new fit with a parameter set that is the solution of the
> previous fit if that fit succeeded (if not I stick with the "safe" initial
> ones, universal to all fits)
Sure, that seems fine. I might guess (and typically find) that using a
fixed set of "decent default initial values" is a bit more robust than
starting Fit N with the result of Fit N-1. But, I can certainly believe
that is not always the case.
> Here is an abbreviated, edited version of my loop:
>
> ```python
> class FitMap():
>     def __init__(self, x, ymap, **kwargs):
>         self._x = np.asarray(x)
>         self._ymap = np.asarray(ymap)
>         self._model = None
>         self._parameters = Parameters()
>
>     def fit(self, **kwargs):
>         map_shape = self._ymap.shape
>         ij = [(i, j) for i in range(map_shape[0]) for j in range(map_shape[1])]
>         num = min(num_fit_per_proc, num_fit_per_batch)
>         with Parallel(n_jobs=num_proc) as parallel:
>             parallel(delayed(self._fit_parallel)(ij, num, n_start, **kwargs)
>                      for n_start in range(0, len(ij), num))
>
>     def _fit_parallel(self, ij, num, n_start, **kwargs):
>         current_best_values = {}
>         for n in range(num):
>             i, j = ij[n_start+n]
>             kkwargs = {**kwargs, **current_best_values}
>             # Prevent current best values from sitting at boundaries
>             if len(current_best_values):
>                 for name, value in current_best_values.items():
>                     par = self._parameters[name]
>                     kkwargs[name] = self._reset_pars_at_boundary(value)
>             result = self._model.fit(self._ymap[i, j], self._parameters,
>                                      x=self._x, **kkwargs)
>             if result.success:
>                 current_best_values = {par.name: par.value
>                                        for par in result.params.values()
>                                        if par.vary}
>             else:
>                 current_best_values = {}
>             # Collect results
> ```
I would recommend having one "initial parameters" object (I think your `self._parameters`) and possibly updating the values of that one object for the next fit, that is, replacing

```python
if len(current_best_values):
    for name, value in current_best_values.items():
        par = self._parameters[name]
        kkwargs[name] = self._reset_pars_at_boundary(value)
```

with

```python
if len(current_best_values):
    for name, value in current_best_values.items():
        self._parameters[name].value = self._reset_pars_at_boundary(value)
```
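`_reset_pars_at_boundary` is the poster's own helper and its implementation is not shown; a minimal sketch of what such a helper might do (a hypothetical implementation, assuming it nudges a best-fit value that has drifted to a bound back into the interior of the allowed range) is:

```python
def reset_value_at_boundary(value, vmin, vmax, fraction=0.02):
    """Hypothetical sketch of a boundary-reset helper: if `value` sits
    at (or beyond) a bound, pull it back into the interior by a small
    fraction of the range, so the next fit does not start pinned
    against a boundary."""
    span = vmax - vmin
    if value <= vmin:
        return vmin + fraction * span   # nudged up from the lower bound
    if value >= vmax:
        return vmax - fraction * span   # nudged down from the upper bound
    return value                        # interior values pass through unchanged
```

The `fraction` knob is an assumption; in practice it would be tuned per problem, along with handling for unbounded parameters (`-inf`/`inf` bounds).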
I would probably not use only `result.success` (which is a pretty generous
meaning of "success", essentially did the fit not abort) to tag "current
best values" but also check `result.errorbars` (which reports if
uncertainties were estimated). I would also want to check the value of the
fit statistic to help decide if the fit was "good", but that might need
fine-tuning for each problem.
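The acceptance test suggested here, requiring both `success` and `errorbars` plus a reasonable fit statistic, might look like the sketch below. The attribute names follow lmfit's `ModelResult` (`success`, `errorbars`, `redchi`), but `fit_is_good`, the `_Result` stand-in class, and the `redchi_max` threshold are assumptions for illustration:

```python
class _Result:
    """Tiny stand-in for lmfit's ModelResult, with only the
    attributes used by the acceptance test below."""
    def __init__(self, success, errorbars, redchi):
        self.success = success
        self.errorbars = errorbars
        self.redchi = redchi

def fit_is_good(result, redchi_max=10.0):
    """Sketch of the suggested per-fit acceptance test: the fit must
    have finished (success), uncertainties must have been estimated
    (errorbars), and the reduced chi-square must fall below a
    problem-specific threshold."""
    return bool(result.success and result.errorbars
                and result.redchi < redchi_max)

ok = fit_is_good(_Result(True, True, 1.2))        # all criteria met
rejected = fit_is_good(_Result(True, False, 1.2)) # no error bars -> reject
```

In the map-fitting loop above, `current_best_values` would only be updated when `fit_is_good(result)` is true, falling back to the "safe" initial parameters otherwise.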
> It took me a little by surprise and sent me digging in the code that model
> parameters that are part of self.param_names can be modified by keyword
> arguments to Model.fit(), whereas any additional parameters that are part
> of Parameters are ignored. For me they are functionally the same in the
> concept of fitting (as long as they are actual fit parameters and not some
> sort of auxiliary parameter).
Yeah, being able to set some parameter values by keyword argument is a
little odd. It is fragile, and incomplete in that you can set the value,
but not any other attributes of a Parameter. This ability might be called
a "wart" -- it's the kind of thing that makes the "simplest possible fit" a
tiny bit easier and then becomes a bad habit and trap. I sort of think
maybe we should fix all the examples that use this.
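The asymmetry discussed in this exchange, where keywords matching `self.param_names` update parameter values while extra user-added parameters are warned about and ignored, can be sketched as a toy stand-in (plain dicts, not lmfit's actual implementation; `toy_fit_kwargs` is a hypothetical name):

```python
import warnings

def toy_fit_kwargs(params, param_names, **kwargs):
    """Toy stand-in for the behavior described above: keywords matching
    the model-function parameter names update the parameter values; any
    other keyword, including user-added parameters such as 'amp_ratio',
    triggers a warning and is ignored."""
    params = dict(params)
    for name, value in kwargs.items():
        if name in param_names:
            params[name] = value
        else:
            warnings.warn(f"The keyword argument {name} does not match "
                          "any arguments of the model function. "
                          "It will be ignored.")
    return params

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    out = toy_fit_kwargs({'peak1amplitude': 1.0, 'amp_ratio': 1.0},
                         param_names={'peak1amplitude'},
                         peak1amplitude=7.0, amp_ratio=1.1)
# 'peak1amplitude' is updated to 7.0; the keyword for the user-added
# 'amp_ratio' is dropped, so its stored value 1.0 survives.
```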
-
A small world... :-) I think we're both on the same page. Thanks for the suggestions and discussion. And yes, the success output is indeed to be taken with a grain of salt (or a pound or more, sometimes... :-)). In the actual code I have additional criteria.
-
Not sure if this is an issue that I would like to report or a conscious choice by the developers, but model.fit() seems to ignore a keyword parameter that was added to the model through an added parameter.
In the example below I try to fit data to two Gaussians. However, instead of the usual 6 free parameters, I want the free parameters to be the amplitude, center, and sigma of the first peak, the center and sigma of the second peak, and the amplitude ratio between the first and the second peak. I initialize the amplitude ratio (to 1.0) when I add the parameter, but then I'd like to change it through a keyword argument in model.fit() (to 1.1). As you can see in the output, the change to 1.1 is ignored, as reflected in the warning that lmfit produces.
I don't know why it should be ignored when the parameter is passed to fit through parameters. In other words, fit knows that it is a parameter even though it's not in the self.param_names set of the model.
I'm using lmfit 1.0.1.
Here is my code:
```python
import numpy as np
from lmfit import Model, Parameters
from lmfit.models import GaussianModel

def gaussian(x, amplitude, center, sigma):
    sig2 = 2.0*sigma*sigma
    norm = sigma*np.sqrt(2.0*np.pi)
    return amplitude*np.exp(-(x-center)**2/sig2)/norm

amp_ratio = 1.35  # amp2 = amp1*amp_ratio
amp1 = 7.0
cen1 = 1.0
sig1 = 1.2
amp2 = amp_ratio*amp1
cen2 = 3.5
sig2 = 0.8
x = np.array(np.linspace(-6, 10, 501))
y = gaussian(x, amp1, cen1, sig1) + gaussian(x, amp2, cen2, sig2)

parameters = Parameters()
parameters.add('amp_ratio', value=1.0)
peak1 = GaussianModel(prefix='peak1')
new_parameters = peak1.make_params()
model = peak1
parameters += new_parameters
peak2 = GaussianModel(prefix='peak2')
new_parameters = peak2.make_params()
model += peak2
parameters += new_parameters
parameters['peak2amplitude'].set(expr='amp_ratio*peak1amplitude')

result = model.fit(y, parameters, x=x, amp_ratio=1.1)
print(result.fit_report(show_correl=False))
```
Output:

```
.../miniconda3/envs/parallel_sandbox/lib/python3.9/site-packages/lmfit/model.py:958: UserWarning: The keyword argument amp_ratio does not match any arguments of the model function. It will be ignored.
  warnings.warn("The keyword argument %s does not " % name +
[[Model]]
    (Model(gaussian, prefix='peak1') + Model(gaussian, prefix='peak2'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 133
    # data points      = 501
    # variables        = 6
    chi-square         = 5.5705e-29
    reduced chi-square = 1.1254e-31
    Akaike info crit   = -35696.3167
    Bayesian info crit = -35671.0170
[[Variables]]
    amp_ratio:       1.35000000 +/- 1.2601e-16 (0.00%) (init = 1)
    peak1amplitude:  7.00000000 +/- 3.9340e-16 (0.00%) (init = 1)
    peak1center:     1.00000000 +/- 7.6484e-17 (0.00%) (init = 0)
    peak1sigma:      1.20000000 +/- 6.5196e-17 (0.00%) (init = 1)
    peak1fwhm:       2.82578400 +/- 1.5353e-16 (0.00%) == '2.3548200*peak1sigma'
    peak1height:     2.32716342 +/- 5.4527e-17 (0.00%) == '0.3989423*peak1amplitude/max(2.220446049250313e-16, peak1sigma)'
    peak2amplitude:  9.45000000 +/- 3.7141e-16 (0.00%) == 'amp_ratio*peak1amplitude'
    peak2center:     3.50000000 +/- 2.5561e-17 (0.00%) (init = 0)
    peak2sigma:      0.80000000 +/- 2.0096e-17 (0.00%) (init = 1)
    peak2fwhm:       1.88385600 +/- 4.7322e-17 (0.00%) == '2.3548200*peak2sigma'
    peak2height:     4.71250592 +/- 1.1838e-16 (0.00%) == '0.3989423*peak2amplitude/max(2.220446049250313e-16, peak2sigma)'
```