First RooFit AD integration #1019

Draft: wants to merge 4 commits into base: main
89 changes: 89 additions & 0 deletions data/ci/template-analysis_shapeInterp_womcstats.txt
@@ -0,0 +1,89 @@
# Datacard produced by CombineHarvester with git status: 8fe0e3c-dirty
imax 1 number of bins
jmax 6 number of processes minus 1
kmax * number of nuisance parameters
--------------------------------------------------------------------------------
shapes * htt_tt_9_13TeV htt_input.root htt_tt_9_13TeV/$PROCESS htt_tt_9_13TeV/$PROCESS_$SYSTEMATIC
shapes bbH htt_tt_9_13TeV htt_input.root htt_tt_9_13TeV/bbH$MASS htt_tt_9_13TeV/bbH$MASS_$SYSTEMATIC
shapes ggH htt_tt_9_13TeV htt_input.root htt_tt_9_13TeV/ggH$MASS htt_tt_9_13TeV/ggH$MASS_$SYSTEMATIC
--------------------------------------------------------------------------------
bin htt_tt_9_13TeV
observation 3416.0
--------------------------------------------------------------------------------
bin htt_tt_9_13TeV htt_tt_9_13TeV htt_tt_9_13TeV htt_tt_9_13TeV htt_tt_9_13TeV htt_tt_9_13TeV htt_tt_9_13TeV
process ZL TTT VVT ZTT jetFakes ggH bbH
process 1 2 3 4 5 -1 0
rate 37.5448 683.017 96.5185 742.649 2048.94 19.9504 198.521
--------------------------------------------------------------------------------
CMS_eff_b_13TeV lnN - 0.99/1.01 0.98/1.01 0.98/1.02 - 0.99/1.01 0.98/1.02
CMS_eff_m lnN - - - 0.96 - - -
CMS_eff_t_13TeV lnN - 1.08 1.08 1.08 - 1.08 1.08
CMS_eff_t_mssmHigh_tt_13TeV shape - 1 1 1 - 1 1
CMS_eff_t_tt_13TeV lnN - 1.092 1.092 1.092 - 1.092 1.092
CMS_fake_b_13TeV lnN 0.99/1.05 - 0.99/1.01 0.97/1.02 - 0.97/1.03 -
CMS_htt_dyShape_scale_m_13TeV shape - - - 1 - - -
CMS_htt_dyShape_stat_m400pt0_13TeV shape - - - 1 - - -
CMS_htt_dyShape_stat_m400pt40_13TeV shape - - - 1 - - -
CMS_htt_dyShape_stat_m400pt80_13TeV shape - - - 1 - - -
CMS_htt_dyShape_tjXsec_13TeV shape - - - 1 - - -
CMS_htt_eFakeTau_loose_13TeV lnN 1.03 - - - - - -
CMS_htt_mFakeTau_loose_13TeV lnN 1.05 - - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_10 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_11 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_12 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_13 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_14 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_15 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_16 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_8 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_TTT_bin_9 shape - 1 - - - - -
CMS_htt_tt_btag_13TeV_VVT_bin_17 shape - - 1 - - - -
CMS_htt_tt_btag_13TeV_VVT_bin_18 shape - - 1 - - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_1 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_15 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_17 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_18 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_2 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_3 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_4 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_ZTT_bin_5 shape - - - 1 - - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_1 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_10 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_11 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_12 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_13 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_14 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_16 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_17 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_2 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_3 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_5 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_6 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_7 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_8 shape - - - - 1 - -
CMS_htt_tt_btag_13TeV_jetFakes_bin_9 shape - - - - 1 - -
CMS_htt_ttbarAccept_tt_btag_13TeV lnN - 1.004 - - - - -
CMS_htt_vvXsec_13TeV lnN - - 1.05 - - - -
CMS_htt_zjXsec_13TeV lnN 1.04 - - - - - -
CMS_htt_zttAccept_tt_btag_13TeV lnN - - - 1.05 - - -
CMS_scale_j_13TeV lnN 0.96/1.01 - 0.99/1.01 0.99/1.01 - 0.99/1.01 0.99/1.01
CMS_scale_t_1prong0pi0_13TeV shape - 1 1 1 - 1 1
CMS_scale_t_1prong1pi0_13TeV shape - 1 1 1 - 1 1
CMS_scale_t_3prong0pi0_13TeV shape - 1 1 1 - 1 1
QCDScale_QshScale_bbH lnN - - - - - - 0.902
ff_norm_stat_tt_tt_btag lnN - - - - 1.028 - -
ff_norm_syst_tt lnN - - - - 1.1 - -
ff_sub_syst_tt_tt_btag lnN - - - - 1.03 - -
lumi_13TeV lnN 1.025 - 1.025 - - 1.025 1.025
norm_ff_dy_frac_tt_syst shape - - - - 1 - -
norm_ff_qcd_dm0_njet0_tt_stat shape - - - - 1 - -
norm_ff_qcd_dm0_njet1_tt_stat shape - - - - 1 - -
norm_ff_qcd_dm1_njet0_tt_stat shape - - - - 1 - -
norm_ff_qcd_dm1_njet1_tt_stat shape - - - - 1 - -
norm_ff_qcd_tt_syst shape - - - - 1 - -
norm_ff_tt_frac_tt_syst shape - - - - 1 - -
norm_ff_tt_tt_syst shape - - - - 1 - -
norm_ff_w_frac_tt_syst shape - - - - 1 - -
norm_ff_w_tt_syst shape - - - - 1 - -
rate_TT rateParam * TTT 1 [0,5]
rate_ZMM_ZTT_btag rateParam * ZTT 1.02 [0.8,1.2]
136 changes: 136 additions & 0 deletions docs/root_ad_development_notes.md
@@ -0,0 +1,136 @@
# Introduction
This file tracks the issues found and the tests conducted while integrating automatic differentiation (AD) into Combine.
## Setup

Using a CMSSW build with ROOT 6.32.06:

```
cmssw-el8
cmsrel CMSSW_14_2_0_pre2_ROOT632
cd CMSSW_14_2_0_pre2_ROOT632/src
git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
git checkout roofit_ad_ichep_2024_63206_comp
cd ../../
cmsenv
scram b -j 8
```
## Tested models
Below you can find test results for several models with classes implemented in Combine.

### Discrete profiling

```
text2workspace.py data/ci/datacard_RooMultiPdf.txt.gz -o multipdf_ws.root
python3 scripts/fitRooFitAD.py --input multipdf_ws.root
```
Fails because only `RooAbsReal` parameters are allowed, whereas the pdf indices in `RooMultiPdf` models are `RooCategory` objects.
<details>
<summary><i>Error message</i></summary>

```bash
RooAbsReal* RooAbsPdf::createNLL(RooAbsData& data, const RooLinkedList& cmdArgs) =>
runtime_error: In creation of function nll_func_wrapper wrapper: input param expected to be of type RooAbsReal.
```

</details>
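
A discrete pdf index has no derivative, which is consistent with the wrapper accepting only `RooAbsReal` parameters. As a toy illustration (not Combine's implementation and not a tested workaround), the sketch below treats the index as an outer loop and uses the gradient only for the continuous parameter. The two quadratic toy NLL functions and all names are hypothetical.

```python
def make_toy_nll(centre, offset):
    """Return a quadratic toy NLL and its analytic gradient."""
    def nll(x):
        return (x - centre) ** 2 + offset

    def grad(x):
        return 2.0 * (x - centre)

    return nll, grad


def gradient_descent(grad, x0, rate=0.1, steps=200):
    """Plain gradient descent on one continuous parameter."""
    x = x0
    for _ in range(steps):
        x -= rate * grad(x)
    return x


# One toy NLL per value of the (hypothetical) discrete index.
toy_models = {0: make_toy_nll(1.0, 0.5), 1: make_toy_nll(-2.0, 0.2)}

# Outer loop over the discrete index, inner gradient-based minimisation.
results = {}
for index, (nll, grad) in toy_models.items():
    x_min = gradient_descent(grad, x0=0.0)
    results[index] = (nll(x_min), x_min)

# The preferred index is the one with the lowest minimised toy NLL.
best_index = min(results, key=lambda i: results[i][0])
print(best_index, results[best_index])
```

Only the inner minimisation needs derivatives, so in this toy setting the discrete index never has to be passed to the differentiated function.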

### autoMCStats

```
text2workspace.py data/ci/template-analysis_shapeInterp.txt -o template_autoMCstats_ws.root -m 200
python3 scripts/fitRooFitAD.py --input template_autoMCstats_ws.root
```
Fails because AD is not implemented for the `RooRealSumPdf` objects that are [used to model MC statistical uncertainties](https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/main/python/ShapeTools.py#L264) with `autoMCStats`; the error reports a missing analytical integral for this class.
<details>
<summary><i>Error message</i></summary>

```bash
[#0] ERROR:Minimization -- An analytical integral function for class "RooRealSumPdf" has not yet been implemented.
Traceback (most recent call last):
File "/afs/cern.ch/work/a/anigamov/CMSSW_14_2_ROOT632_X_2024-10-06-2300/src/HiggsAnalysis/CombinedLimit/scripts/fitRooFitAD.py", line 29, in <module>
nll = pdf.createNLL(data, Constrain=constrain, GlobalObservables=global_observables, EvalBackend="codegen")
File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02858/el8_amd64_gcc12/lcg/root/6.32.06-f59fcaa47c786e7268e714e7e477ee41/lib/ROOT/_pythonization/_roofit/_rooabspdf.py", line 116, in createNLL
return self._createNLL["RooLinkedList const&"](args[0], _pack_cmd_args(*args[1:], **kwargs))
cppyy.gbl.std.runtime_error: Could not find "createNLL<RooLinkedList const&>" (set cppyy.set_debug() for C++ errors):
RooAbsReal* RooAbsPdf::createNLL(RooAbsData& data, const RooLinkedList& cmdArgs) =>
runtime_error: An analytical integral function for class "RooRealSumPdf" has not yet been implemented.
```
</details>

### Template-based model without MC statistical uncertainties (no autoMCStats)

```
text2workspace.py data/ci/template-analysis_shapeInterp_womcstats.txt -o template_ws.root -m 200
python3 scripts/fitRooFitAD.py --input template_ws.root
```
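The fit runs and the results are reasonable, but the warning shown below is emitted: clad cannot differentiate `TMath::LnGamma` and falls back to numerical differentiation.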
<details>
<summary><i>Warning:</i></summary>

```bash
In module 'RooFitCore':
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02858/el8_amd64_gcc12/lcg/root/6.32.06-f59fcaa47c786e7268e714e7e477ee41/include/RooFit/Detail/MathFuncs.h:365:45: warning: function 'LnGamma' was not differentiated because clad failed to differentiate it and no suitable overload
was found in namespace 'custom_derivatives'
return pdf - weight * std::log(pdf) + TMath::LnGamma(weight + 1);
^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02858/el8_amd64_gcc12/lcg/root/6.32.06-f59fcaa47c786e7268e714e7e477ee41/include/RooFit/Detail/MathFuncs.h:365:45: note: falling back to numerical differentiation for 'LnGamma' since no suitable overload was found and clad
could not derive it; to disable this feature, compile your programs with -DCLAD_NO_NUM_DIFF
```

</details>
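
The `LnGamma(weight + 1)` term in the line flagged by clad depends only on the event weight, not on the pdf value, so its contribution to the gradient with respect to the model parameters should vanish whenever the weights are fixed data. The minimal Python sketch below illustrates this with a stand-alone copy of that expression. `poisson_nll_term`, `analytic_dpdf` and `central_diff` are hypothetical helper names, not RooFit functions, and the finite-difference derivative is used only as a cross-check.

```python
import math


def poisson_nll_term(pdf, weight):
    """Stand-alone copy of the expression quoted in the clad warning."""
    return pdf - weight * math.log(pdf) + math.lgamma(weight + 1)


def analytic_dpdf(pdf, weight):
    """Derivative with respect to pdf; the lgamma term drops out."""
    return 1.0 - weight / pdf


def central_diff(f, x, h=1e-6):
    """Central finite difference, used here only as a cross-check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


pdf, weight = 3.2, 5.0
numeric = central_diff(lambda p: poisson_nll_term(p, weight), pdf)

# Prints whether the numerical and analytic derivatives agree.
print(abs(numeric - analytic_dpdf(pdf, weight)) < 1e-6)
```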

### RooParametricHist

```
text2workspace.py data/ci/datacard_RooParametricHist.txt -o ws_RooParametricHist.root
python3 scripts/fitRooFitAD.py --input ws_RooParametricHist.root
```
Fails because the code generation ("code squashing") for `RooHistPdf::weight` currently supports only uniformly binned histograms.

<details>
<summary><i>Error message</i></summary>

```bash
[#0] ERROR:InputArguments -- RooHistPdf::weight(shapeBkg_tqq_muonCRfail2016) ERROR: Code Squashing currently only supports uniformly binned cases.
....
^
[#0] ERROR:InputArguments -- Function roo_func_wrapper_0 could not be compiled. See above for details.
Traceback (most recent call last):
File "/afs/cern.ch/work/a/anigamov/CMSSW_14_2_ROOT632_X_2024-10-06-2300/src/HiggsAnalysis/CombinedLimit/scripts/fitRooFitAD.py", line 29, in <module>
nll = pdf.createNLL(data, Constrain=constrain, GlobalObservables=global_observables, EvalBackend="codegen")
File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02858/el8_amd64_gcc12/lcg/root/6.32.06-f59fcaa47c786e7268e714e7e477ee41/lib/ROOT/_pythonization/_roofit/_rooabspdf.py", line 116, in createNLL
return self._createNLL["RooLinkedList const&"](args[0], _pack_cmd_args(*args[1:], **kwargs))
cppyy.gbl.std.runtime_error: Could not find "createNLL<RooLinkedList const&>" (set cppyy.set_debug() for C++ errors):
RooAbsReal* RooAbsPdf::createNLL(RooAbsData& data, const RooLinkedList& cmdArgs) =>
runtime_error: Function roo_func_wrapper_0 could not be compiled. See above for details.
```

</details>
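
The error message indicates that the generated code handles only uniformly binned histograms. One plausible reason (an assumption, not confirmed by the source) is that uniform binning allows a closed-form bin index, whereas variable binning needs a search over the bin edges. The sketch below contrasts the two lookups in plain Python; `uniform_bin_index` and `variable_bin_index` are illustrative names, not RooFit API.

```python
from bisect import bisect_right


def uniform_bin_index(x, lo, hi, nbins):
    """Closed-form lookup: valid only when all bins have the same width."""
    width = (hi - lo) / nbins
    idx = int((x - lo) / width)
    return min(max(idx, 0), nbins - 1)  # clamp to the valid range


def variable_bin_index(x, edges):
    """Search-based lookup: works for arbitrary increasing bin edges."""
    idx = bisect_right(edges, x) - 1
    return min(max(idx, 0), len(edges) - 2)  # clamp to the valid range


uniform_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
variable_edges = [0.0, 0.5, 1.0, 2.0, 4.0]

print(uniform_bin_index(2.5, 0.0, 4.0, 4))
print(variable_bin_index(2.5, uniform_edges))
print(variable_bin_index(2.5, variable_edges))
```

Both lookups agree on the uniform edges, but only the search-based one gives the right bin once the bin widths differ.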

### RooHistPdf

```
ulimit -s unlimited
text2workspace.py data/ci/datacard_RooHistPdf.txt.gz -o ws_RooHistPdf.root
python3 scripts/fitRooFitAD.py --input ws_RooHistPdf.root
```
Fails because AD is not implemented for the [`VerticalInterpPdf` class](https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/roofit_ad_ichep_2024_63206_comp/src/VerticalInterpPdf.cc).
<details>
<summary><i>Error message</i></summary>

```bash
[#0] ERROR:Minimization -- An analytical integral function for class "VerticalInterpPdf" has not yet been implemented.
Traceback (most recent call last):
File "/afs/cern.ch/work/a/anigamov/CMSSW_14_2_ROOT632_X_2024-10-06-2300/src/HiggsAnalysis/CombinedLimit/scripts/fitRooFitAD.py", line 29, in <module>
nll = pdf.createNLL(data, Constrain=constrain, GlobalObservables=global_observables, EvalBackend="codegen")
File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02858/el8_amd64_gcc12/lcg/root/6.32.06-f59fcaa47c786e7268e714e7e477ee41/lib/ROOT/_pythonization/_roofit/_rooabspdf.py", line 116, in createNLL
return self._createNLL["RooLinkedList const&"](args[0], _pack_cmd_args(*args[1:], **kwargs))
cppyy.gbl.std.runtime_error: Could not find "createNLL<RooLinkedList const&>" (set cppyy.set_debug() for C++ errors):
RooAbsReal* RooAbsPdf::createNLL(RooAbsData& data, const RooLinkedList& cmdArgs) =>
runtime_error: An analytical integral function for class "VerticalInterpPdf" has not yet been implemented.
```

</details>

### Channel masking

Channel masking is implemented [within the evaluation of the custom `CachingNLL` class](https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/main/src/CachingNLL.cc#L1078).
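
As a toy illustration of what evaluating with masked channels means, the sketch below sums per-channel negative log-likelihood values and skips the channels flagged as masked. It is a simplified stand-in under stated assumptions, not the `CachingNLL` implementation; the function name, the channel names and the numbers are hypothetical.

```python
def masked_nll(channel_nlls, masks):
    """Sum per-channel NLL values, skipping channels whose mask is set."""
    return sum(
        nll for name, nll in channel_nlls.items() if not masks.get(name, False)
    )


# Hypothetical per-channel NLL values.
channel_nlls = {"signal_region": 12.5, "control_region": 30.25}
masks = {"signal_region": True}  # mask the signal region only

print(masked_nll(channel_nlls, masks))  # only the unmasked channel contributes
print(masked_nll(channel_nlls, {}))     # no masks: all channels contribute
```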