add staterror config to spec to have faithful roundtrip #760
Comments
@kratsg pointed out that |
related: #662 |
Similarly to stat errors, |
I see this has been kicked back several versions, is there any estimate of when it might become available? A related question: Do I understand correctly that if one wants to use a Poisson constraint, it can simply be specified as a |
This is currently slated for
While they are both multiplicative modifiers, they differ in more than the implementation of the Poisson and Normal constraint p.d.f.s (cf. the "Modifiers and Constraints" table in the docs, though it seems you have already looked at it): the method of determining the number of nuisance parameters allocated per sample is also different for each modifier. So that's something to be aware of. |
Thanks, that answers my question. It's not super urgent, since it's safe to use Out of curiosity, why is a Gaussian used for MC stat uncertainties? Since they're statistical in nature a Poisson should be more accurate, particularly in cases with large statistical uncertainties, right? The reason I got into asking these questions was because I was having issues with |
Just a gentle reminder on this, since it seems it's been pushed back another major(?) version now. My analysis is now actually running into some limitations with the number of NPs, so it would be great to be able to combine Poisson Again, why does |
Hi @balunas, I can't comment on the time line, but maybe a few comments about the other points: I think the use of Gaussian as default came from the HistFactory implementation in ROOT, where this was the default choice before ROOT 6.22. I agree that the Poisson makes more sense as a default, at least when you have a uniform weight distribution. If you do find that the choice of constraint term makes a big difference, then it may be worth revisiting the model setup. Large MC statistical uncertainties can bias your signal extraction (link to ATLAS-internal slides with a study). How are you building your workspace at the moment? I believe you should technically be able to correlate If MC statistical uncertainties are problematic, it may also be interesting to consider re-binning or merging samples, which can address this type of issue more generally. |
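To make the Gaussian vs. Poisson comparison above concrete, here is a minimal plain-Python sketch (not pyhf code; all function names are illustrative). It assumes the usual HistFactory convention for a single staterror parameter gamma with relative MC uncertainty delta: a Gaussian constraint with auxiliary measurement 1 and width delta, and a Poisson constraint with auxiliary count tau = 1/delta^2 and rate gamma * tau.

```python
import math


def gaussian_nll(gamma, rel_unc):
    """-log Normal(1 | mean=gamma, sigma=rel_unc): Gaussian constraint term."""
    pull = (1.0 - gamma) / rel_unc
    return 0.5 * pull**2 + math.log(rel_unc * math.sqrt(2.0 * math.pi))


def poisson_nll(gamma, rel_unc):
    """-log Pois(tau | rate=gamma*tau), assuming tau = 1/rel_unc**2.

    tau is generally not an integer, so the factorial is continued with lgamma.
    """
    tau = 1.0 / rel_unc**2
    rate = gamma * tau
    return rate - tau * math.log(rate) + math.lgamma(tau + 1.0)


def delta_nll(nll, gamma, rel_unc):
    """NLL penalty for moving gamma away from its nominal value of 1."""
    return nll(gamma, rel_unc) - nll(1.0, rel_unc)


if __name__ == "__main__":
    for rel_unc in (0.01, 0.5):
        for shift in (-1.0, 1.0):
            gamma = 1.0 + shift * rel_unc
            print(
                rel_unc,
                gamma,
                delta_nll(gaussian_nll, gamma, rel_unc),
                delta_nll(poisson_nll, gamma, rel_unc),
            )
```

Under these assumptions the two penalties agree closely for small relative uncertainties, while for large ones the Poisson term becomes asymmetric in gamma, which is where the choice of constraint starts to matter.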
The drawback of just using There will probably be arguments against this, but I propose that Poisson be made the default for |
If you use the same parameter names, the modifiers are going to be correlated and you will end up with a single NP. I do not know whether there is internal optimization in Related to large MC uncertainties: this bias is not only an issue when scaling both signal and backgrounds together, but it also can appear with the more careful split treatment. When applying the same NP to both signal and background, the situation can certainly be worse though when pulls happen due to low background MC statistics and the effect is propagated to a (presumably high-statistic) signal. I completely agree with you that Poisson should be the default, for the arguments you mention and for consistency with ROOT. |
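As a rough illustration of the point about shared parameter names, the sketch below walks a toy workspace-like dictionary and groups modifiers by name. It is plain Python rather than pyhf, the helper names and the toy spec are made up for the example, and it assumes (as a simplification) that parameter sets are keyed by modifier name alone. It counts parameter sets, not individual per-bin parameters.

```python
from collections import defaultdict


def parameter_sets(spec):
    """Map each modifier name to the (channel, sample) pairs it acts on."""
    usage = defaultdict(list)
    for channel in spec["channels"]:
        for sample in channel["samples"]:
            for modifier in sample["modifiers"]:
                usage[modifier["name"]].append((channel["name"], sample["name"]))
    return dict(usage)


def make_spec(signal_np, background_np):
    """Build a toy one-channel spec; only the modifier names vary."""

    def sample(name, data, np_name):
        modifier = {"name": np_name, "type": "normsys", "data": {"hi": 1.1, "lo": 0.9}}
        return {"name": name, "data": data, "modifiers": [modifier]}

    samples = [
        sample("signal", [5.0, 10.0], signal_np),
        sample("background", [50.0, 60.0], background_np),
    ]
    return {"channels": [{"name": "SR", "samples": samples}]}


if __name__ == "__main__":
    shared = parameter_sets(make_spec("mc_norm", "mc_norm"))
    split = parameter_sets(make_spec("mc_norm_sig", "mc_norm_bkg"))
    print(len(shared), shared)  # same name on both samples
    print(len(split), split)  # distinct names per sample
```

With the same name on both samples the modifiers collapse into one entry acting on both, which is the correlated single-NP case; with distinct names there are two independent entries.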
Yes and no. We decided to not do a
What @alexander-held has here is correct. I don't think there are any reasons to argue against changing it, so to help that along please make a new Issue for that. |
Maybe to be practical, I think there are multiple aspects to this:
Changing the default without implementing a way to switch behavior might not be ideal, as it will change the results of people who are used to the Gaussian setup. I think it might be possible to make this configurable though without already involving the JSON spec, and only do so in a separate step (assuming that this is maybe easier to make some partial progress). The natural way to configure this to me would be |
hi, i appreciate the discussion here. we do generally have limited time and I do think this is prioritized enough that we want to make sure it gets into the upcoming major release. What I propose we'll do (I have not discussed this with the core developers) is to support a user-friendly interface through That said, it is possible to make this functional today in your existing code like so:
import pyhf
import json
ws = pyhf.Workspace(json.load(open('3b_tag21.2.27-1_RW_ExpSyst_79800_multibin_excl_Gtt_2400_5000_800.json')))
pdf = ws.model()
print(pdf.config.param_set('staterror_SR1L_Inj_Lmeff_cuts'))
def to_poisson(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        result['paramset_type'] = pyhf.parameters.constrained_by_poisson
        return result
    return wrapper
pyhf.modifiers.staterror.required_parset = to_poisson(pyhf.modifiers.staterror.required_parset)
pdf = ws.model()
print(pdf.config.param_set('staterror_SR1L_Inj_Lmeff_cuts'))
and this seems to work in being able to change things on a global-level (which I propose that
This does also allow you to change constraints for other modifier types in a global way in a similar fashion ... To summarize
|
In the current HEAD of
import pyhf
required_parset = pyhf.modifiers.staterror.required_parset
import json
ws = pyhf.Workspace(json.load(open('3b_tag21.2.27-1_RW_ExpSyst_79800_multibin_excl_Gtt_2400_5000_800.json')))
pdf = ws.model()
print(pdf.config.param_set('staterror_SR1L_Inj_Lmeff_cuts'))
def to_poisson(func):
    def wrapper(*args, **kwargs):
        result = required_parset(*args, **kwargs)
        result['paramset_type'] = 'constrained_by_poisson'
        result['factors'] = result.pop('sigmas')
        return result
    return wrapper
pyhf.modifiers.staterror.required_parset = to_poisson(pyhf.modifiers.staterror.required_parset)
pdf = ws.model()
print(pdf.config.param_set('staterror_SR1L_Inj_Lmeff_cuts')) |
Related to roundtrips, an adversarial example with multiple |
Description
We should add staterrorconfig as a channel property. Currently most analyses use Poisson as the config, which is not the default and thus needs to be specified in the XML.
Converting XML -> JSON -> XML loses this information, since we never record this non-default value in the JSON and thus cannot reproduce the original XML.
During parsing, the staterrorconfig should become a value in the paramsets used by the staterror modifier.
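A toy sketch of the roundtrip problem, using only the standard library. It assumes the HistFactory channel XML carries a `StatErrorConfig` element with a `ConstraintType` attribute; the JSON key `staterrorconfig` and both converter functions are hypothetical and are not pyhf's actual readers or writers.

```python
import xml.etree.ElementTree as ET

CHANNEL_XML = (
    '<Channel Name="SR">'
    '<StatErrorConfig RelErrorThreshold="0.05" ConstraintType="Poisson"/>'
    "</Channel>"
)


def channel_to_json(xml_text):
    """Convert a channel element to a dict, recording the staterror config."""
    channel = ET.fromstring(xml_text)
    spec = {"name": channel.get("Name"), "samples": []}
    config = channel.find("StatErrorConfig")
    if config is not None:
        # Hypothetical key: store the non-default setting as a channel property.
        spec["staterrorconfig"] = dict(config.attrib)
    return spec


def channel_to_xml(spec):
    """Write the dict back to XML, restoring the config if it was recorded."""
    channel = ET.Element("Channel", Name=spec["name"])
    config = spec.get("staterrorconfig")
    if config:
        ET.SubElement(channel, "StatErrorConfig", config)
    return ET.tostring(channel, encoding="unicode")


if __name__ == "__main__":
    spec = channel_to_json(CHANNEL_XML)
    print(channel_to_xml(spec))  # config survives the roundtrip

    spec.pop("staterrorconfig")  # mimic a JSON that never recorded it
    print(channel_to_xml(spec))  # config is gone, original XML not reproducible
```

The second print mimics the current situation described above: once the non-default value is absent from the JSON, nothing downstream can recover it.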