Toys seems to give strange results #892
@darousso Thanks for your question and for using pyhf. As you're asking about a feature that hasn't been released yet with official support and isn't in the public API, it would be helpful to get as much information as possible from you. Can you please give us the following information?
As we haven't made a release with pseudoexperiment generation support yet, it would be good to understand this now, so that if there are issues we can resolve them before we publicly release this. So your question is very important and we appreciate you bringing it up (also for being a bold beta tester 🚀).
Thanks for your super quick response! I downloaded it via … If I do …
Python Notebook has been attached as a txt file since GitHub is not letting me upload it (sorry). Thank you so much for helping look into this! And also thank you for developing pyhf (it is so much easier to use than HistFitter...)!
Ah okay great. This actually has all the information we need, along with the notebook you sent us. We can start debugging from here; thanks! This might take some time to better understand the actual behavior, but once we get the pseudoexperiment generation support into …
@darousso sorry, can you actually send us the notebook another way, or put it somewhere that we can access it through CERN? Unfortunately it seems some of the cell formatting got strangely messed up when I renamed the …
@matthewfeickert Sorry, I am not exactly sure how best to attach it, so I have just directly sent it to your CERN email. Let me know if that's alright!
Perfect. Thanks!
Okay, I've just ripped the spec out and made everything a runnable Python script that I put in this Gist. This can serve as a debugging source, but I am able to replicate your results with the spec and running with JAX as you were:
Our analysis is also looking at using toys with pyhf, so I'm interested in this answer as well!
This will definitely be addressed along the way to the release of …
Out of curiosity, in what timeframe is v0.5.0 planned to be released?
We don't have set timeframes, but it is a priority for us to get out. The three of us have been a bit swamped at the moment with preparing various software tutorials and conferences, and there are some additional PRs that will need to go in as well before we can release …
Note: very small numbers, so asymptotics may not set in yet. The systematic variations …

- Is there a nice way to plot a histogram of the test statistic from the toys?
- Is there a nice way to plot the values of the parameters being used to generate the toys (vs. mu)?
- Does the aux data vary in the toy generation?
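For anyone wanting to eyeball the toy test-statistic distribution in the meantime, here is a minimal standalone sketch (standard library only, not the pyhf API, with made-up yields): for a hypothetical single-bin counting experiment it generates Poisson pseudo-experiments under both hypotheses and prints decile summaries of a simple log-likelihood-ratio test statistic, which is the same information a histogram of the toys would show.

```python
import math
import random

random.seed(42)

def sample_poisson(lam):
    """Poisson sampler via Knuth's algorithm (the standard library has none)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= random.random()
    return k - 1

# Hypothetical single-bin counting experiment (made-up yields)
b, s = 1.0, 3.0

def q(n):
    """-2 ln[L(s+b) / L(b)] for an observed Poisson count n (the n! terms cancel)."""
    return -2.0 * (n * math.log((s + b) / b) - s)

ntoys = 10_000
toys_sb = sorted(q(sample_poisson(s + b)) for _ in range(ntoys))
toys_b = sorted(q(sample_poisson(b)) for _ in range(ntoys))

# A crude text "histogram": decile boundaries of each toy distribution
print("s+b deciles:   ", [round(toys_sb[i * ntoys // 10], 2) for i in range(10)])
print("b-only deciles:", [round(toys_b[i * ntoys // 10], 2) for i in range(10)])
```

Because the observed count is discrete, the toy distributions are discrete too, which matters for the questions above about saturated p-values.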
Hello, sorry, I am a bit confused by the roadmaps: has the fix been implemented in the current release? -David
No, the related PR hasn't been merged.
This is scheduled for minor release …
Ah perfect, thanks!
(Sorry to bother again: I have just learned that our analysis intends to request EB in December, would the minor release be out by then?)
hi @darousso, thanks for checking in. Yes, that is very much our plan. We also now have an announcement mailing list that you can join to be updated on upcoming releases and general announcements: https://groups.google.com/group/pyhf-announcements/subscribe. If you join you'll be able to know a few days in advance when you can expect …
hi @darousso, I think this is actually expected/understood. can you try to plot a brazil band for your setup? specifically with sufficiently high mu (say up to mu=5). At some point the upper sigmas should come down.
@darousso following up on what @lukasheinrich is pointing to, this is probably related to the discussion in PR #966.
thanks @kratsg, these look reasonable. I should create a "learn notebook" to explain what the issue is. In short, since half of the test statistic distribution under the signal hypothesis is a delta function, CL_s+b can be either 1.0 or in the interval [0.0, 0.5], so CLs values can exhibit this type of discontinuity. In any case, for an upper limit you're interested in the region where CLs ~ 0.05, so µ between 2 and 4 in the above plot.
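The delta-function point above can be seen numerically. The sample below is entirely made up for illustration: half of a hypothetical set of s+b toy test-statistic values sit exactly at zero, so when the observed test statistic also lands at zero, the tail probability is either 1.0 (inclusive comparison, q >= q_obs) or at most 0.5 (exclusive comparison, q > q_obs).

```python
# Made-up s+b toy test-statistic sample: half of the toys sit exactly at
# zero (the delta function), the rest are spread over positive values.
samples = [0.0] * 5000 + [0.5, 1.0, 2.0, 4.0] * 1250  # 10000 toys in total

q_obs = 0.0  # observed test statistic lands on the delta function

# Inclusive tail probability P(q >= q_obs): the delta function is counted
p_inclusive = sum(q >= q_obs for q in samples) / len(samples)
# Exclusive tail probability P(q > q_obs): the delta function is skipped
p_exclusive = sum(q > q_obs for q in samples) / len(samples)

print(p_inclusive)  # 1.0
print(p_exclusive)  # 0.5
```

This is the discontinuity described above: depending on which side of the delta function the comparison falls, the p-value jumps between 1.0 and the [0.0, 0.5] interval.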
hi @darousso, there'll be more explanation, but generally your setup seems ok. Using the additions in #1160, this is what your example looks like:

- existing comparison between toys and yours (slightly apples to oranges)
- adjusting asymptotics to make an apples-to-apples comparison
the fixes in #1126 should also make these plots look nicer
Hi everyone, thank you so much for doing this!! Really sorry for the late reply; I have been trying my best to play around with things on my side with the new stuff. I am quite new to limit setting, so I am a bit confused by the above discussion. I am assuming from the above that for the hypotest the issue lies with the fact that the curves are not constrained to be 1 at mu=0, which creates this saturation, and that it is fixed with the "inclusive_pvalue=False" parameter in the learn_toys_funky branch? What does this parameter do? (The hypotest function won't accept the "clip" parameter for toys; I am assuming that is just for asymptotics?) Is inclusive_pvalue the only thing I need to change in the new setup?
Hi @kratsg and @matthewfeickert, thank you so much for all your help on this front, and my sincerest apologies for the late reply! I have tried running things with the latest pyhf version (as well as with the learn_toys_funky branch back in November when there was the inclusive_pvalue option). I definitely see an improvement over last time (as back then I couldn't even get an exclusion curve), but I am still noticing some strange behaviour with the toys calculator: it looks like the upper sigma bands have collapsed and the experimental curve is hugging the expected one. Would you happen to know what I may be doing incorrectly? (I notice the clipping option is only available for asymptotics?)
Once again, my sincerest apologies if I am screwing up something really simple... (I realize I shouldn't be putting ATLAS-specific plots here even if the results are already public. It is here: https://gitlab.cern.ch/drousso/dvjetsanalysis/-/blob/e9a82569b44a28b32d1f9668f3afc4f541a89641/DVMuonLimits/TestCondorMET_100000/Exclusion_Plot.png )
Dear @matthewfeickert et al., …
Hello, sorry for bugging again. I just thought it may be more useful if I actually provided some example workspaces and a description of the behaviour comparing pyhf and HistFitter in a couple of slides. Are the asymmetric bands expected behaviour, with this instead being a bug in HistFitter? -David
Note to devs: this Issue needs to be the focus of …
@darousso With PR #1610 in, I can try to revisit this in the back half of the week once I'm out of workshops. As was mentioned, the asymptotic approximations may not be valid here, so it probably makes sense to do a comparison between HistFitter and pyhf pseudo-experiments and to reproduce @kratsg's plots from earlier (#892 (comment)) with the updated code.
Dear authors, this problem seems to persist in pyhf 0.6.3. Has there been any progress on it?
This needs to get revisited for the …
Minimal reproducible example:

```python
import json

import matplotlib.pyplot as plt

import pyhf
from pyhf.contrib.viz import brazil

pyhf.set_backend("jax")

j = json.loads('{"channels": [{"name": "channel1", "samples": [{"data": [10.0, 0.2], "modifiers": [{"data": null, "name": "lumi", "type": "lumi"}, {"data": null, "name": "mu_sig", "type": "normfactor"}, {"data": {"hi": 1.1, "lo": 0.9}, "name": "bsm_uncerts", "type": "normsys"}], "name": "bsm_signal"}, {"data": [0.62, 1.94], "modifiers": [{"data": [0.31, 0.97], "name": "staterror_channel1", "type": "staterror"}, {"data": {"hi_data": [0.64, 2.033], "lo_data": [0.6, 1.847]}, "name": "non_linearity", "type": "histosys"}], "name": "bkg"}]}], "measurements": [{"config": {"parameters": [{"auxdata": [1.0], "bounds": [[0.5, 1.5]], "inits": [1.0], "name": "lumi", "sigmas": [0.017]}], "poi": "mu_sig"}, "name": "meas"}], "observations": [{"data": [0, 0], "name": "channel1"}], "version": "1.0.0"}')
ws = pyhf.Workspace(j)
model = ws.model()
data = ws.data(model)

poi_vals = [0.0, 0.1, 0.2, 0.3, 0.5, 1.0]
results = [
    pyhf.infer.hypotest(
        test_poi, data, model, test_stat="qtilde", return_expected_set=True, calctype="toybased", ntoys=10000
    )
    for test_poi in poi_vals
]

fig, ax = plt.subplots()
fig.set_size_inches(7, 5)
brazil.plot_results(poi_vals, results, ax=ax)
fig.savefig("analysis_example.pdf")
plt.close(fig)
```

The resulting plot looks like the attached one.
…and running the exact same example as above in the current dev version of pyhf, …
Dear @matthewfeickert, all, … Thank you, …
cf. Issue #1720 also. We have regression fixing to do.
As an intermediate solution, you can switch your …
This should be currently fixed in …
We've gotten no confirmation. Given that nobody's run into issues with the fixes provided by @lukasheinrich, this will get closed.
Question
I have noticed that the toys-based hypotest seems to give some strange values even when the asymptotics calculator gives regular-looking values.
In general the cls_exp upper-sigma values all seem to be saturated at 1 for all signal points, and this does not seem to improve when increasing the number of toys.
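One way such saturation can arise with toys (a made-up numerical illustration, not the actual pyhf internals): with very low expected counts the b-only test-statistic distribution is highly discrete, so several of the quantiles used for the expected band can land on exactly the same value, and the expected CLs values derived from them then coincide.

```python
# Made-up b-only toy test-statistic sample for a very low-count channel:
# most pseudo-experiments observe n = 0, so their test-statistic values
# all coincide at the maximum.
samples = sorted([0.5] * 1000 + [3.2] * 2000 + [6.0] * 7000)  # 10000 toys

# Quantile points corresponding to -2, -1, 0, +1, +2 sigma of a Normal
quantile_points = [0.0228, 0.1587, 0.5, 0.8413, 0.9772]
band = [samples[int(p * len(samples))] for p in quantile_points]

print(band)  # [0.5, 3.2, 6.0, 6.0, 6.0]
```

Here the median and both upper-sigma quantiles are identical, so any expected band computed from them collapses onto a single value, which is the qualitative behaviour reported above.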
Relevant Issues and Pull Requests
An example signal point model is the following:
The following results are obtained:
Do let me know if you need more information about the environment or the actual files I am running (it is in an ipynb).