correctionlib

Introduction

The purpose of this library is to provide a well-structured JSON data format for a wide variety of ad-hoc correction factors encountered in a typical HEP analysis and a companion evaluation tool suitable for use in C++ and python programs. Here we restrict our definition of correction factors to a class of functions with scalar inputs that produce a scalar output.

In python, the function signature is:

from typing import Union

def f(*args: Union[str,int,float]) -> float:
    return ...

In C++, the evaluator implements this currently as:

double Correction::evaluate(const std::vector<std::variant<int, double, std::string>>& values) const;

The supported function classes include:

multi-dimensional binned lookups;
binned lookups pointing to multi-argument formulas with a restricted math function set (exp, sqrt, etc.);
categorical (string or integer enumeration) maps;
input transforms (updating one input value in place); and
compositions of the above.
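As a rough illustration of the first and third items and of their composition, the sketch below implements a one-dimensional binned lookup selected by a categorical key, using only the Python standard library. It is not how correctionlib itself is implemented; the names `binned_lookup`, `scale_factor`, `PT_EDGES`, and `SF_BY_WP`, and all the numbers, are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence, Union

Value = Union[str, int, float]


def binned_lookup(edges: Sequence[float], content: Sequence[float], x: float) -> float:
    """Return the content of the bin containing x; bins are [low, high)."""
    if not edges[0] <= x < edges[-1]:
        raise ValueError(f"{x} is outside [{edges[0]}, {edges[-1]})")
    # bisect_right gives the index of the first edge strictly above x
    return content[bisect_right(edges, x) - 1]


# Hypothetical values, for illustration only
PT_EDGES = [20.0, 40.0, 80.0, 200.0]
SF_BY_WP = {
    "loose": [0.98, 0.99, 1.00],
    "tight": [0.95, 0.97, 0.99],
}


def scale_factor(*args: Value) -> float:
    """Composition: categorical map (working point) -> binned lookup (pt)."""
    working_point, pt = args
    content = SF_BY_WP[working_point]  # categorical step
    return binned_lookup(PT_EDGES, content, float(pt))  # binned step
```

Calling `scale_factor("tight", 45.0)` selects the "tight" row and then the second pt bin. The function keeps the scalar-in, scalar-out signature described above.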

Each function type is represented by a "node" in a call graph and holds all of its parameters in a JSON structure, described by the JSON schema. Possible future extension nodes might include weighted sums (which, when composed with the others, could represent a BDT) and perhaps simple MLPs.
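To make the node idea concrete, here is a minimal sketch of a recursive evaluator over a JSON document. The layout (keys `nodetype`, `input`, `edges`, `content`) is an assumption made for this example and should not be read as the actual schema; consult the schema itself for the real field names.

```python
import json
from bisect import bisect_right

# Hypothetical node layout, for illustration only; the real schema may differ.
DOC = json.loads("""
{
  "inputs": ["flavor", "pt"],
  "data": {
    "nodetype": "category",
    "input": "flavor",
    "content": {
      "b": {"nodetype": "binning", "input": "pt",
            "edges": [20, 50, 100], "content": [0.9, 0.95]},
      "light": 1.0
    }
  }
}
""")


def evaluate(node, values):
    """Walk the node graph until a plain number (a leaf) is reached."""
    if isinstance(node, (int, float)):
        return float(node)
    x = values[node["input"]]
    if node["nodetype"] == "category":
        return evaluate(node["content"][x], values)
    if node["nodetype"] == "binning":
        edges = node["edges"]
        if not edges[0] <= x < edges[-1]:
            raise ValueError(f"{node['input']}={x} is out of range")
        return evaluate(node["content"][bisect_right(edges, x) - 1], values)
    raise ValueError(f"unknown nodetype {node['nodetype']!r}")


def correction(*args):
    """Bind positional arguments to the declared input names and evaluate."""
    return evaluate(DOC["data"], dict(zip(DOC["inputs"], args)))
```

Because each node's content is either a number or another node, composition needs no extra machinery: `correction("b", 60.0)` passes through a category node and then a binning node, while `correction("light", 30.0)` stops at the first leaf.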

The tool should provide:

standardized, versioned JSON schemas;
forward-porting tools (to migrate data written in older schema versions); and
a well-optimized C++ evaluator and python bindings (with numpy vectorization support).

This tool will definitely not provide:

support for TLorentzVector or other object-type inputs (such tools should be written as a higher-level tool depending on this library as a low-level tool)

Formula support currently includes a mostly-complete subset of the ROOT library TFormula class, and is implemented in a threadsafe standalone manner. The parsing grammar is formally defined and parsed through the use of a header-only PEG parser library. The supported features mirror CMSSW's reco::formulaEvaluator and fully pass the test suite for that utility, with the purposeful exception of the TMath:: namespace. The python bindings may be able to call into numexpr, though, due to the tree-like structure of the corrections, it may prove difficult to exploit vectorization at levels other than the entrypoint.
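The idea of a restricted math function set can be sketched in a few lines of Python. The example below is not the library's PEG-based parser and does not follow the TFormula grammar: it borrows Python's own expression syntax through the standard `ast` module and rejects everything outside a small whitelist. The name `evaluate_formula` and the choice of whitelisted functions are assumptions made for this illustration.

```python
import ast
import math
import operator

# Whitelisted functions and operators; anything else is rejected.
FUNCTIONS = {"exp": math.exp, "sqrt": math.sqrt, "log": math.log}
BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_formula(expression: str, variables: dict) -> float:
    """Evaluate an arithmetic expression over named scalar variables."""

    def visit(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return float(node.value)
        if isinstance(node, ast.Name) and node.id in variables:
            return float(variables[node.id])
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY:
            return BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY:
            return UNARY[type(node.op)](visit(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in FUNCTIONS
            and not node.keywords
        ):
            return FUNCTIONS[node.func.id](*(visit(arg) for arg in node.args))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return visit(ast.parse(expression, mode="eval").body)
```

For example, `evaluate_formula("0.5 + 0.1*sqrt(x)", {"x": 4.0})` evaluates the expression with `x` bound to 4, while an expression that calls anything outside the whitelist raises `ValueError`. Walking the syntax tree by hand, rather than calling `eval`, is what keeps the function set restricted.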

Installation

The build process is Makefile-based for the C++ evaluator and via setuptools for the python bindings. Builds have been tested in Windows, OS X, and Linux, and python bindings can be compiled against both python2 and python3, as well as from within a CMSSW environment. The python bindings are distributed as a pip-installable package.

To build in an environment that has python 3, you can simply

pip install correctionlib

(possibly with --user, or in a virtualenv, etc.). Note that CMSSW 11_2_X and above have ROOT accessible from python 3.

If you have a pure C++ framework, you can build the C++ evaluator in most environments via:

git clone --recursive git@github.com:nsmith-/correctionlib.git
cd correctionlib
make
# demo C++ binding, main function at src/demo.cc
gunzip data/examples.json.gz
./demo data/examples.json

Eventually this will be simplified to a pip install and a correction-config utility to retrieve the header and linking flags.

To compile with python2 support, consider using python 3 :) If you considered that and still want to use python2, follow the C++ build instructions and then call make PYTHON=python2 correctionlib to compile. Inside CMSSW you should use make PYTHON=python correctionlib assuming python is the name of the scram tool you intend to link against. This will output a correctionlib directory that acts as a python package, and can be moved where needed. This package will only provide the correctionlib._core evaluator module, as the schema tools and high-level bindings are python3-only.

Creating new corrections

The correctionlib.schemav2 module provides a helpful framework for defining correction objects and correctionlib.convert includes select conversion routines for common types. Nodes can be type-checked as they are constructed using the parse_obj class method or by directly constructing them using keyword arguments. Some examples can be found in data/conversion.py. The tests/ directory may also be helpful.

Developing

See CONTRIBUTING.md
