passing a keyword parameter to model.fit() ignores added parameters #828
Replies: 7 comments 2 replies
-
@rolfverberg If I understand correctly, I would call that "intentional". I think that if you had better initial values for the Parameters, you would get a better fit ;).
-
What works is the following: setting the new value directly on the Parameters object before calling fit, rather than passing it as a keyword argument. Now I get the desired output; the [[Variables]] section shows init = 1.1 for amp_ratio.
-
@rolfverberg Again, if I understand correctly, I would call all of that "intentional". For sure, you have to set an initial value for each parameter; there is never a case in which a default value should be used. If you are not setting an initial value for every parameter, you are going to get bad fits. If you pass in a Parameters object and also pass in a value for a particular parameter to Model.fit(), which one should take precedence? We chose "ignore the argument, use the value in the Parameters".
-
Um, what? I do not know what you mean by "map of curves", so maybe I'm not understanding you. If you want to state that you cannot run fits in a loop, I will not disagree with you. But lots of people do exactly that, and I do not understand why you cannot.
Do you mean something like calling `Model.fit()` in a loop over your datasets?
A model has a `param_names` attribute built from the arguments of the model function; a model comprised of two Gaussians will have 6 elements in `param_names`. But, of course, there will be additional parameters as well, such as derived parameters (`fwhm`, `height`) or any parameters you add yourself.
-
I guess I should have explained the goal more clearly; thanks for your patience. I'm collecting data in a synchrotron (X-ray) experiment. The data is an energy spectrum, intensity vs X-ray energy, so a simple 1D plot. However, I have an energy spectrum for a set of parameters that I can organize in a 2D map, so one spectrum for each point of the 2D map. Think of something like transmission data vs X-ray energy for a range of x and y positions on a 3D sample, with the X-ray beam in the z-direction. I now want to fit the 1D intensity vs X-ray energy spectra for each x and y position, and there are a lot of them, say 1000x1000.

I first create a model, and it may contain constraints between model parameters of the model components that are added as parameters in addition to the model components. This model is the same and shared for all the individual fits. I then fit all datasets with this model in parallel.

I have a set of "safe" or "robust" initial parameters for all the model parameters (yes, decent initial parameters are crucial): the sum of the model component parameters and any constrained parameters added in addition to the model component parameters. I like that set to remain unchanged from fit to fit, so I can go back to it if a fit fails. I do, however, like to start each new fit with a parameter set that is the solution of the previous fit, if that fit succeeded (if not, I stick with the "safe" initial ones, universal to all fits). Here is an abbreviated, edited version of my loop:

```python
class FitMap():
    def __init__(self, x, ymap, **kwargs):
        self._x = np.asarray(x)
        self._ymap = np.asarray(ymap)
        self._model = None
        self._parameters = Parameters()

    def fit(self, **kwargs):
        map_shape = self._ymap.shape
        ij = [(i, j) for i in range(map_shape[0]) for j in range(map_shape[1])]
        num = min(num_fit_per_proc, num_fit_per_batch)
        with Parallel(n_jobs=num_proc) as parallel:
            parallel(delayed(self._fit_parallel)(ij, num, n_start, **kwargs)
                     for n_start in range(0, len(ij), num))

    def _fit_parallel(self, ij, num, n_start, **kwargs):
        current_best_values = {}
        for n in range(num):
            i, j = ij[n_start+n]
            kkwargs = {**kwargs, **current_best_values}
            # Prevent current best values from sitting at boundaries
            if len(current_best_values):
                for name, value in current_best_values.items():
                    par = self._parameters[name]
                    kkwargs[name] = self._reset_pars_at_boundary(value)
            result = self._model.fit(self._ymap[i, j], self._parameters,
                                     x=self._x, **kkwargs)
            if result.success:
                current_best_values = {par.name: par.value
                                       for par in result.params.values()
                                       if par.vary}
            else:
                current_best_values = {}
            # Collect results
```

I like and appreciate the separation of model and parameters, and I can work around this easily: I can copy the initial parameters that I like to keep unmodified from the start for each fit, make any changes I like, and feed the modified set to Model.fit(). It took me a little by surprise, and sent me digging in the code, that model parameters that are part of self.param_names can be modified by keyword arguments to Model.fit(), whereas any additional parameters that exist only in the Parameters object are ignored. Which is why I didn't phrase it as a bug or error: it's a choice that was made, and one that I can work with. To me the difference is not intuitive, but that doesn't make it right or wrong... I am very pleased with lmfit and grateful for the effort in creating and maintaining the package!
-
On Fri, Nov 11, 2022 at 9:32 AM Rolf Verberg ***@***.***> wrote:
> I guess I should have explained the goal more clearly, thanks for your
> patience. I'm collecting data on a synchrotron (say sort of xray)
> experiment. The data is an energy spectrum, intensity vs xray energy, so a
> simple 1D plot. However, I have an energy spectrum for a set of parameters
> that I can organize in a 2D map, so one spectrum for each point of the 2D
> map. Think of something like transmission data vs xray energy for a range
> of x and y positions on a 3D sample, with the xray beam in the z-direction.
> I now want to fit the 1D intensity vs xray energy spectra for each x and y
> position and there's a lot of them, say 1000x1000.
Yep, I work at a synchrotron beamline (doing X-ray Absorption and
Fluorescence spectroscopies). Lots of people using lmfit are synchrotron
people. In many ways, lmfit is a spin-off from XAS software.
> I first create a model and it may contain constraints between model
> parameters of the model components that are added as parameters in addition
> to the model components. This model is the same and shared for all the
> individual fits.
> I then fit all datasets with this model in parallel.
Yep, that's all fine.
> I have a set of "safe" or "robust" initial parameters for all the model
> parameters (yes, decent initial parameters are crucial), the sum of the
> model component parameters and any constrained parameters added in addition
> to the model component parameters. I like that set to remain unchanged from
> fit to fit, so I can go back to it if a fit fails. I do however like to
> start each new fit with a parameter set that is the solution of the
> previous fit if that fit succeeded (if not I stick with the "safe" initial
> ones, universal to all fits)
Sure, that seems fine. I might guess (and typically find) that using a
fixed set of "decent default initial values" is a bit more robust than
starting Fit N with the result of Fit N-1. But, I can certainly believe
that is not always the case.
> Here is an abbreviated, edited version of my loop:
>
> ```python
> class FitMap():
>     def __init__(self, x, ymap, **kwargs):
>         self._x = np.asarray(x)
>         self._ymap = np.asarray(ymap)
>         self._model = None
>         self._parameters = Parameters()
>
>     def fit(self, **kwargs):
>         map_shape = self._ymap.shape
>         ij = [(i, j) for i in range(map_shape[0]) for j in range(map_shape[1])]
>         num = min(num_fit_per_proc, num_fit_per_batch)
>         with Parallel(n_jobs=num_proc) as parallel:
>             parallel(delayed(self._fit_parallel)(ij, num, n_start, **kwargs)
>                      for n_start in range(0, len(ij), num))
>
>     def _fit_parallel(self, ij, num, n_start, **kwargs):
>         current_best_values = {}
>         for n in range(num):
>             i, j = ij[n_start+n]
>             kkwargs = {**kwargs, **current_best_values}
>             # Prevent current best values from sitting at boundaries
>             if len(current_best_values):
>                 for name, value in current_best_values.items():
>                     par = self._parameters[name]
>                     kkwargs[name] = self._reset_pars_at_boundary(value)
>             result = self._model.fit(self._ymap[i, j], self._parameters,
>                                      x=self._x, **kkwargs)
>             if result.success:
>                 current_best_values = {par.name: par.value
>                                        for par in result.params.values()
>                                        if par.vary}
>             else:
>                 current_best_values = {}
>             # Collect results
> ```
I would recommend having one "initial parameters" object (I think your `self._parameters`) and possibly updating the values of that one object for the next fit, that is, replacing

```python
if len(current_best_values):
    for name, value in current_best_values.items():
        par = self._parameters[name]
        kkwargs[name] = self._reset_pars_at_boundary(value)
```

with

```python
if len(current_best_values):
    for name, value in current_best_values.items():
        self._parameters[name].value = self._reset_pars_at_boundary(value)
```
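`_reset_pars_at_boundary` is the poster's own helper and its implementation is not shown; a minimal sketch of what such a helper might do (a hypothetical implementation, assuming it nudges a best-fit value that has drifted to a bound back into the interior of the allowed range) is:

```python
def reset_value_at_boundary(value, vmin, vmax, fraction=0.02):
    """Hypothetical sketch of a boundary-reset helper: if `value` sits
    at (or beyond) a bound, pull it back into the interior by a small
    fraction of the range, so the next fit does not start pinned
    against a boundary."""
    span = vmax - vmin
    if value <= vmin:
        return vmin + fraction * span   # nudged up from the lower bound
    if value >= vmax:
        return vmax - fraction * span   # nudged down from the upper bound
    return value                        # interior values pass through unchanged
```

The `fraction` knob is an assumption; in practice it would be tuned per problem, along with handling for unbounded parameters (`-inf`/`inf` bounds).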
I would probably not use only `result.success` (which is a pretty generous
meaning of "success", essentially did the fit not abort) to tag "current
best values" but also check `result.errorbars` (which reports if
uncertainties were estimated). I would also want to check the value of the
fit statistic to help decide if the fit was "good", but that might need
fine-tuning for each problem.
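The acceptance test suggested here, requiring both `success` and `errorbars` plus a reasonable fit statistic, might look like the sketch below. The attribute names follow lmfit's `ModelResult` (`success`, `errorbars`, `redchi`), but `fit_is_good`, the `_Result` stand-in class, and the `redchi_max` threshold are assumptions for illustration:

```python
class _Result:
    """Tiny stand-in for lmfit's ModelResult, with only the
    attributes used by the acceptance test below."""
    def __init__(self, success, errorbars, redchi):
        self.success = success
        self.errorbars = errorbars
        self.redchi = redchi

def fit_is_good(result, redchi_max=10.0):
    """Sketch of the suggested per-fit acceptance test: the fit must
    have finished (success), uncertainties must have been estimated
    (errorbars), and the reduced chi-square must fall below a
    problem-specific threshold."""
    return bool(result.success and result.errorbars
                and result.redchi < redchi_max)

ok = fit_is_good(_Result(True, True, 1.2))        # all criteria met
rejected = fit_is_good(_Result(True, False, 1.2)) # no error bars -> reject
```

In the map-fitting loop above, `current_best_values` would only be updated when `fit_is_good(result)` is true, falling back to the "safe" initial parameters otherwise.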
> It took me a little by surprise and sent me digging in the code that model
> parameters that are part of self.param_names can be modified by keyword
> arguments to Model.fit(), whereas any additional parameters that are part
> of Parameters are ignored. For me they are functionally the same in the
> concept of fitting (as long as they are actual fit parameters and not some
> sort of auxiliary parameter).
Yeah, being able to set some parameter values by keyword argument is a
little odd. It is fragile, and incomplete in that you can set the value,
but not any other attributes of a Parameter. This ability might be called
a "wart" -- it's the kind of thing that makes the "simplest possible fit" a
tiny bit easier and then becomes a bad habit and trap. I sort of think
maybe we should fix all the examples that use this.
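The asymmetry discussed in this exchange, where keywords matching `self.param_names` update parameter values while extra user-added parameters are warned about and ignored, can be sketched as a toy stand-in (plain dicts, not lmfit's actual implementation; `toy_fit_kwargs` is a hypothetical name):

```python
import warnings

def toy_fit_kwargs(params, param_names, **kwargs):
    """Toy stand-in for the behavior described above: keywords matching
    the model-function parameter names update the parameter values; any
    other keyword, including user-added parameters such as 'amp_ratio',
    triggers a warning and is ignored."""
    params = dict(params)
    for name, value in kwargs.items():
        if name in param_names:
            params[name] = value
        else:
            warnings.warn(f"The keyword argument {name} does not match "
                          "any arguments of the model function. "
                          "It will be ignored.")
    return params

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    out = toy_fit_kwargs({'peak1amplitude': 1.0, 'amp_ratio': 1.0},
                         param_names={'peak1amplitude'},
                         peak1amplitude=7.0, amp_ratio=1.1)
# 'peak1amplitude' is updated to 7.0; the keyword for the user-added
# 'amp_ratio' is dropped, so its stored value 1.0 survives.
```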
-
A small world... :-) I think we're both on the same page. Thanks for the suggestions and discussion. And yes, the success output is indeed to be taken with a grain of salt (or a pound or more, sometimes... :-)). In the actual code I have additional criteria.
-
Not sure if this is an issue that I would like to report or a conscious choice by the developers, but model.fit() seems to ignore a keyword parameter that was added to the model through an added parameter.
In the example below I try to fit data to two Gaussians. However, instead of the usual 6 free parameters, I want the free parameters to be the amplitude, center, and sigma of the first peak, the center and sigma of the second peak, and the amplitude ratio between the first and the second peak. I initialize the amplitude ratio (to 1.0) when I add the parameter, but then I'd like to change it through a keyword argument in model.fit() (to 1.1). As you can see in the output, the change to 1.1 is ignored, as reflected in the warning that lmfit produces.
I don't know why it should be ignored when the parameter is passed to fit through parameters. In other words, fit knows that it is a parameter even though it's not in the self.param_names set of the model.
I'm using lmfit 1.0.1.
Here is my code:
```python
import numpy as np
from lmfit import Model, Parameters
from lmfit.models import GaussianModel

def gaussian(x, amplitude, center, sigma):
    sig2 = 2.0*sigma*sigma
    norm = sigma*np.sqrt(2.0*np.pi)
    return amplitude*np.exp(-(x-center)**2/sig2)/norm

amp_ratio = 1.35  # amp2 = amp1*amp_ratio
amp1 = 7.0
cen1 = 1.0
sig1 = 1.2
amp2 = amp_ratio*amp1
cen2 = 3.5
sig2 = 0.8
x = np.array(np.linspace(-6, 10, 501))
y = gaussian(x, amp1, cen1, sig1) + gaussian(x, amp2, cen2, sig2)

parameters = Parameters()
parameters.add('amp_ratio', value=1.0)
peak1 = GaussianModel(prefix='peak1')
new_parameters = peak1.make_params()
model = peak1
parameters += new_parameters
peak2 = GaussianModel(prefix='peak2')
new_parameters = peak2.make_params()
model += peak2
parameters += new_parameters
parameters['peak2amplitude'].set(expr='amp_ratio*peak1amplitude')

result = model.fit(y, parameters, x=x, amp_ratio=1.1)
print(result.fit_report(show_correl=False))
```
Output:

```
.../miniconda3/envs/parallel_sandbox/lib/python3.9/site-packages/lmfit/model.py:958: UserWarning: The keyword argument amp_ratio does not match any arguments of the model function. It will be ignored.
  warnings.warn("The keyword argument %s does not " % name +
[[Model]]
    (Model(gaussian, prefix='peak1') + Model(gaussian, prefix='peak2'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 133
    # data points      = 501
    # variables        = 6
    chi-square         = 5.5705e-29
    reduced chi-square = 1.1254e-31
    Akaike info crit   = -35696.3167
    Bayesian info crit = -35671.0170
[[Variables]]
    amp_ratio:       1.35000000 +/- 1.2601e-16 (0.00%) (init = 1)
    peak1amplitude:  7.00000000 +/- 3.9340e-16 (0.00%) (init = 1)
    peak1center:     1.00000000 +/- 7.6484e-17 (0.00%) (init = 0)
    peak1sigma:      1.20000000 +/- 6.5196e-17 (0.00%) (init = 1)
    peak1fwhm:       2.82578400 +/- 1.5353e-16 (0.00%) == '2.3548200*peak1sigma'
    peak1height:     2.32716342 +/- 5.4527e-17 (0.00%) == '0.3989423*peak1amplitude/max(2.220446049250313e-16, peak1sigma)'
    peak2amplitude:  9.45000000 +/- 3.7141e-16 (0.00%) == 'amp_ratio*peak1amplitude'
    peak2center:     3.50000000 +/- 2.5561e-17 (0.00%) (init = 0)
    peak2sigma:      0.80000000 +/- 2.0096e-17 (0.00%) (init = 1)
    peak2fwhm:       1.88385600 +/- 4.7322e-17 (0.00%) == '2.3548200*peak2sigma'
    peak2height:     4.71250592 +/- 1.1838e-16 (0.00%) == '0.3989423*peak2amplitude/max(2.220446049250313e-16, peak2sigma)'
```