fix: pt variation change carried through signal region calculations #163

ekauffma · 2023-06-23T09:01:03Z

Addresses the third point in #162: pt variations are now carried through the signal region calculations by overwriting the jet object.

analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.py

alexander-held

As a sanity check, could you please compare the output histograms to https://github.com/eguiraud/analysis-grand-challenge-coffea/tree/correct-outputs (a single file should be enough)? That one is built from the correctionlib version with the back-ported W+jets scale variation fix.

ekauffma · 2023-06-26T15:15:59Z

Documented discrepancies here:
compare_result (1).txt

The differences seem to be due to floating-point error.

alexander-held · 2023-06-26T17:38:02Z

That's interesting. Looking at the first example:

got      [43415.260890439, 90524.50293484158, 124669.85937904267, 78590.96671041285, 51739.84456109777, 36289.230414965794, 25796.08326822223, 20466.19299444227, 15611.67260972688, 12753.150585139763, 10134.699515096338, 8746.319925277914, 54751.576864877905]
expected [43415.260890439, 90524.50293484157, 124668.95084443182, 78590.93441945997, 51740.78538666148, 36289.230414965794, 25796.08326822223, 20466.19299444227, 15611.67260972688, 12753.150585139763, 10134.699515096338, 8746.319925277914, 54751.576864877905]

It looks like there are just some events migrating between bins. For example, the differences in these three bins

got 124669.85937904267, 78590.96671041285, 51739.84456109777
expected 124668.95084443182, 78590.93441945997, 51740.78538666148

are presumably caused by two events moving between the outer bins and the inner bin. From a quick look at other cases, that generally seems to explain the observed behavior. It should be fine to move forward with this then. Thanks for checking!
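A quick way to check the bin-migration reading is to confirm that the per-bin differences cancel in the total yield. The following is only a minimal sketch using the two yield lists quoted above; the 1e-6 tolerance is an arbitrary choice for illustration and is not taken from the comparison script.

```python
import numpy as np

# Yields copied from the "got" / "expected" lines quoted above.
got = np.array([
    43415.260890439, 90524.50293484158, 124669.85937904267,
    78590.96671041285, 51739.84456109777, 36289.230414965794,
    25796.08326822223, 20466.19299444227, 15611.67260972688,
    12753.150585139763, 10134.699515096338, 8746.319925277914,
    54751.576864877905,
])
expected = np.array([
    43415.260890439, 90524.50293484157, 124668.95084443182,
    78590.93441945997, 51740.78538666148, 36289.230414965794,
    25796.08326822223, 20466.19299444227, 15611.67260972688,
    12753.150585139763, 10134.699515096338, 8746.319925277914,
    54751.576864877905,
])

diff = got - expected

# Assumed tolerance: anything below it is treated as rounding noise.
TOLERANCE = 1e-6
changed_bins = np.flatnonzero(np.abs(diff) > TOLERANCE)

print("bins that differ:", changed_bins)
print("per-bin difference:", diff[changed_bins])
# If events only migrate between bins, the total yield is unchanged.
print("net change in total yield:", diff.sum())
```

Only bins 2, 3 and 4 differ beyond the tolerance, and their differences cancel in the sum, which is what migration between neighbouring bins would produce.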

alexander-held · 2023-06-26T22:07:27Z

analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.py

+                         "mass": selected_jets_region.mass,
+                         "btagCSVV2": selected_jets_region.btagCSVV2},
+                        with_name="PtEtaPhiMLorentzVector"
+                    )

                    # reconstruct hadronic top as bjj system with largest pT
                    # the jet energy scale / resolution effect is not propagated to this observable at the moment


The comment line about the jet energy scale / resolution effect not being propagated needs to be removed, as this update takes care of propagating things correctly.

alexander-held · 2023-06-26T22:25:18Z

I've looked into this a bit; code for reference is in https://gist.github.com/alexander-held/d44d089b7a71f25bae17a03665325400. This compares the correctionlib-based version from @eguiraud's branch to this PR (without correctionlib). I'm essentially only looking at nominal ttbar, a single file, and a zoomed-in histogram with bin edges [189.999, 190.0, 190.001]. I am printing out the values of the 4j2b observable that fall within that window, with 190 GeV subtracted from them.

this PR:

-0.0002548726
0.0003070348
0.0001955073
0.0003590095

resulting in bin yields of [3.69506563 3.69506563]

correctionlib version:

0.0000000000
0.0000000000
0.0000000000
0.0000000000

resulting in bin yields of [1.84753282 5.54259845]

The differences are caused by events migrating across the boundary. What is not clear to me is why all the events in the correctionlib version seem to be ending up at exactly 190 GeV.

alexander-held · 2023-06-26T22:45:14Z

Looking closer into types, this PR ends up with the observable being

<Array [367, 245, 383, 627, ... 352, 236, 617] type='48939 * ?float64'>

while the correctionlib version has

<Array [517, 229, 192, ... 262, 3.84e+03, 198] type='48881 * ?float32'>

I guess the difference here comes down to 32-bit vs. 64-bit floats; I'm not sure what exactly causes the difference between the setups.

Edit: as far as I can tell, the conversion to 64-bit happens in the ak.zip(..., with_name="PtEtaPhiMLorentzVector") call.
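To illustrate how float precision alone can move an event across a bin edge, here is a minimal NumPy-only sketch. The bin edges are the ones used in the zoomed-in histogram above; the observable value and the bin_index helper are hypothetical, chosen for illustration, and do not come from the analysis or the gist.

```python
import numpy as np

# Bin edges of the zoomed-in histogram discussed above.
edges = np.array([189.999, 190.0, 190.001])


def bin_index(value):
    """Index i of the bin [edges[i], edges[i+1]) that contains value."""
    return int(np.searchsorted(edges, np.float64(value), side="right") - 1)


# Hypothetical observable value just below the 190 GeV edge.
value64 = np.float64(190.0) - 1e-6
# The same value stored as a 32-bit float is rounded to the nearest float32.
value32 = np.float32(value64)

# Distance between neighbouring float32 values around 190.
print("float32 spacing near 190:", float(np.spacing(np.float32(190.0))))
print("float64 value:", repr(float(value64)), "-> bin", bin_index(value64))
print("float32 value:", repr(float(value32)), "-> bin", bin_index(value32))
```

In this sketch the float64 value stays in the lower bin, while the float32 copy rounds to exactly 190.0 and lands in the upper bin. This only shows that precision can cause such migrations; it does not explain why every event in the correctionlib version sits at exactly 190 GeV.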

alexander-held

Looks good, thank you!

pt variation change carried through signal region calculations

e9e8b62

alexander-held reviewed Jun 26, 2023


This was referenced Jun 26, 2023

Path towards v1.1 #162

Closed

Understand scale variation changes via correctionlib implementation #131

Closed


alexander-held mentioned this pull request Jun 27, 2023

Add a way to validate output histograms against a trusted reference #136

Closed

ekauffma added 2 commits June 27, 2023 14:03

changed ordering of filters

19e0e07

removed outdated comment

7ab17bf

alexander-held approved these changes Jun 27, 2023


alexander-held merged commit b9407e6 into iris-hep:agc-v1 Jun 27, 2023

alexander-held mentioned this pull request Jun 28, 2023

Histogram validation and bin migrations #168

Closed
