Finding uncertainty on parameter of interest with non-parabolic likelihood #844
-
Hello, I am currently trying to find a suitable library to perform some simple fits. I have a histogram of pseudo-data (exactly following the Standard Model prediction) with a covariance matrix, and a histogram of possible deviations (beyond-the-Standard-Model contributions). Essentially, this amounts to a parameter inference for which a likelihood can be specified. I assumed that I could perform such a fit with iminuit using an UnbinnedNLL cost function, since I treat the bin contents as observations that should be compared with an array of means "data_hist + a * a * deviations_hist". I tried to implement this in the following code; for illustrative purposes I start with a linear deviation (a * deviations_hist) because, in that case, it works well.
This returns a 68% interval of [-0.707, 0.707], which agrees with my expectation from the argument that the (2 * negative log-likelihood + 1) contour should determine the 1-sigma interval of the parameter of interest: 2 * a^2 - 1 = 0 (we have two bins) leads to a_68% = +- 1/sqrt(2), which is approximately 0.707. However, it breaks down when I specify a quadratic scaling of possible deviations by simply changing the model to "data_hist + a * a * deviations_hist". I tried to tune the initial value and the limits a bit and found that a = 0.05 with limits from -1 to 1 yields a decent, but not very good, result: a best fit of 0.05 with a down uncertainty of 1.05 and an up uncertainty of 1.03. Note that the analytic calculation of the boundaries of the 68% interval yields a_68% = +- (1/2)^(1/4), approximately 0.8408964. I am quite puzzled that the minimisation returns the given initial value of 0.05 as the best fit, because the verbose printouts show that the FCN is smaller towards 0. I would appreciate any guidance on how to obtain the uncertainties / the upper limit on the "a" parameter when it enters quadratically. Thanks a lot! Best,
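[Editor's note: the two boundaries quoted in the question can be cross-checked numerically. This sketch is not part of the original post; it assumes the same toy setup used throughout the thread: two unit-variance, uncorrelated bins, so 2 * delta-NLL equals 2a^2 for linear coupling and 2a^4 for quadratic coupling.]

```python
# Cross-check of the quoted 68% boundaries for linear and quadratic coupling.
# Assumes two bins with unit variance and no correlations, so each bin
# contributes resid^2 / 2 to the NLL, where resid = a (linear) or a^2 (quadratic).
import numpy as np
from scipy.optimize import brentq

def two_delta_nll(a, quadratic):
    resid = a * a if quadratic else a     # per-bin shift in units of sigma
    return 2 * (2 * 0.5 * resid**2)       # 2 * NLL difference, two bins

# Solve 2 * delta-NLL = 1 for the positive boundary in each case.
a_lin = brentq(lambda a: two_delta_nll(a, False) - 1.0, 0.0, 2.0)
a_quad = brentq(lambda a: two_delta_nll(a, True) - 1.0, 0.0, 2.0)
print(a_lin)   # ~0.7071 = 1/sqrt(2)
print(a_quad)  # ~0.8409 = (1/2)**0.25
```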
-
You have to call `flatten` in your model, because the data you want to fit is 1D; the model prediction must match the data. (You can also fit multivariate data with a multivariate model with iminuit.) If you have binned data (a histogram), you cannot use UnbinnedNLL; that is for unbinned samples. That you need to set `m.errordef = Minuit.LEAST_SQUARES` is a consequence of the misuse of UnbinnedNLL; it would be wrong if you used UnbinnedNLL in its correct context. The correct likelihood for binned data is BinnedNLL, but you cannot use that either, because your histogram does not contain Poisson-distributed counts but has bin-to-bin correlations, which are captured by the covariance matrix.

There is currently no builtin cost function in iminuit.cost for your case, but it is easy to write a likelihood from scratch. You assume that the data is distributed according to a multivariate normal, so the log-likelihood is logL = mvnorm.logpdf(observation, prediction, covariance), where observation and prediction are vectors and covariance is a matrix. There is no sum here, because you have a single observation, the histogram. iminuit expects a negative log-likelihood, so your cost function looks like this:

```python
from iminuit import Minuit
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal as mvnorm

# likelihood has a single free parameter "a"
def nll(a):
    SM = np.ones(2)
    BSM = np.ones(2)
    cov = np.eye(2)  # can also use a non-diagonal covariance matrix; diagonal for simplicity here
    return -mvnorm.logpdf(SM, SM + a * BSM, cov)

m = Minuit(nll, a=0.0)  # give an initial value of the best fit that is known a priori
m.limits["a"] = (-10, 10)  # we expect the interval roughly in this range
m.errordef = Minuit.LIKELIHOOD  # required: nll is a negative log-likelihood
m.migrad()
print(m)
m.draw_profile("a")
plt.xlim(-2, 2)
plt.ylim(0, 3)
plt.show()
```

This produces the correct confidence interval of [-0.707, 0.707].
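[Editor's note: as a sanity check, not part of the original reply, the mvnorm-based nll above reduces, up to an additive constant, to a^2 for two unit-variance uncorrelated bins, so the 2 * delta-NLL = 1 crossing sits exactly at a = 1/sqrt(2):]

```python
import numpy as np
from scipy.stats import multivariate_normal as mvnorm

# Same toy nll as in the reply above (linear coupling).
def nll(a):
    SM = np.ones(2)
    BSM = np.ones(2)
    cov = np.eye(2)
    return -mvnorm.logpdf(SM, SM + a * BSM, cov)

# NLL(a) - NLL(0) should equal a^2 (two bins, each contributing a^2 / 2),
# so 2 * delta-NLL reaches 1 at a = 1/sqrt(2) ~ 0.707.
for a in (0.25, 0.5, 1 / np.sqrt(2)):
    assert np.isclose(nll(a) - nll(0.0), a * a)
print(2 * (nll(1 / np.sqrt(2)) - nll(0.0)))  # -> 1.0, the 68% crossing
```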
Yes, with `m.minos()` this also works for the quadratic model:

```python
from iminuit import Minuit
import numpy as np
from scipy.stats import multivariate_normal as mvnorm

# likelihood has a single free parameter "a"
def nll(a):
    SM = np.ones(2)
    BSM = np.ones(2)
    cov = np.eye(2)  # can also use a non-diagonal covariance matrix; diagonal for simplicity here
    return -mvnorm.logpdf(SM, SM + a * a * BSM, cov)

m = Minuit(nll, a=0.0)  # give an initial value of the best fit that is known a priori
m.limits["a"] = (-10, 10)  # we expect the interval roughly in this range
m.errordef = Minuit.LIKELIHOOD  # required: nll is a negative log-likelihood
m.migrad()
m.minos()
print(m)
```

For me, MINOS succeeds:
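[Editor's note, not from the original reply: this also explains why MIGRAD tends to return the initial value here. With the quadratic coupling the NLL behaves like a^4 near the minimum, so both the gradient and the second derivative vanish at a = 0; the parabolic (HESSE) error is therefore degenerate, while MINOS, which scans the actual profile, still works. A quick numerical check on the same toy nll:]

```python
import numpy as np
from scipy.stats import multivariate_normal as mvnorm

# Toy nll with quadratic coupling, as in the reply above.
def nll(a):
    SM = np.ones(2)
    BSM = np.ones(2)
    return -mvnorm.logpdf(SM, SM + a * a * BSM, np.eye(2))

# Central finite differences at the minimum a = 0.
h = 1e-3
d1 = (nll(h) - nll(-h)) / (2 * h)               # first derivative: exactly 0 by symmetry
d2 = (nll(h) - 2 * nll(0.0) + nll(-h)) / h**2   # second derivative: ~2*h^2, vanishes as h -> 0
print(d1, d2)  # both ~0: no usable parabolic approximation at the minimum
```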
-
@JaLuka98 If your question has been answered, please indicate that here on GitHub by clicking the button. Thanks!
-
Ah sure, sorry, I just forgot. It helped me a lot. If I need any further help, I will open a new thread. Thank you very much!
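[Editor's addendum, not from the thread: the code comments above mention non-diagonal covariance matrices. Since the reply notes it is easy to write this likelihood from scratch, here is a sketch of the parameter-dependent part, 0.5 * r^T C^-1 r, which matches mvnorm.logpdf up to the constant normalisation term 0.5 * log(det(2*pi*C)):]

```python
import numpy as np
from scipy.stats import multivariate_normal as mvnorm

def nll_scratch(obs, pred, cov):
    # Negative log-likelihood of a multivariate normal, dropping the
    # parameter-independent normalisation constant: 0.5 * r^T C^-1 r
    r = obs - pred
    return 0.5 * r @ np.linalg.solve(cov, r)

obs = np.array([1.0, 1.0])
cov = np.array([[1.0, 0.3], [0.3, 1.0]])  # correlated bins
for a in (0.0, 0.5, 1.0):
    pred = obs + a * a * np.ones(2)
    full = -mvnorm.logpdf(obs, pred, cov)
    # The two differ only by the constant 0.5 * log(det(2*pi*C)),
    # which does not depend on "a" and so does not affect the fit.
    assert np.isclose(full - nll_scratch(obs, pred, cov),
                      0.5 * np.log(np.linalg.det(2 * np.pi * cov)))
print("scratch NLL matches mvnorm.logpdf up to a constant")
```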