A simple config-based tool for high-energy-physics machine learning tasks.
Full documentation and usage instructions are available here:
Currently supports:
- Binary classification (currently using XGBoost and DNN)
  Examples: DY vs ttbar, DY prompt vs DY fake, good electrons vs bad electrons
- Multi-sample classification (currently using XGBoost and DNN)
  Examples: DY vs (ttbar and QCD)
- Multi-class classification (currently using XGBoost and DNN)
  Examples: DY vs ttbar vs QCD, good photons vs bad photons
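The three modes differ mainly in how samples are mapped to class labels. The following is a minimal sketch of that mapping on six toy events; the helper `build_labels` and the label dictionaries are hypothetical illustrations, not part of the tool's API or config format.

```python
# Hypothetical illustration only; none of these names come from the tool itself.

def build_labels(sample_names, class_map):
    """Map each event's sample name to an integer class label."""
    return [class_map[name] for name in sample_names]

# Sample of origin for six toy events.
events = ["DY", "ttbar", "QCD", "DY", "QCD", "ttbar"]

# Binary: one sample against one other sample (QCD events are not used).
binary_map = {"DY": 1, "ttbar": 0}
binary_labels = build_labels([e for e in events if e in binary_map], binary_map)

# Multi-sample: several samples merged into one class, still two classes overall.
multi_sample_map = {"DY": 1, "ttbar": 0, "QCD": 0}
multi_sample_labels = build_labels(events, multi_sample_map)

# Multi-class: every sample is its own class.
multi_class_map = {"DY": 0, "ttbar": 1, "QCD": 2}
multi_class_labels = build_labels(events, multi_class_map)

print(binary_labels)
print(multi_sample_labels)
print(multi_class_labels)
```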
Salient features:
- Parallel reading of ROOT files (using Dask)
- Runs on flat ntuples (even NanoAODs) out of the box
- Adding multiple MVAs is trivial (subject to available computing power)
- Cross-section and pt-eta reweighting can be handled together
- Multi-sample training possible
- Multi-class training possible
- Ability to customize thresholds
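One common way to combine cross-section weights with pt-eta reweighting is to take a bin-by-bin ratio of weighted 2D histograms and multiply it into the existing per-event weights. The sketch below assumes NumPy, made-up toy data, and a hypothetical helper `pt_eta_weights`; the tool's actual implementation may differ.

```python
import numpy as np

def pt_eta_weights(pt, eta, base_w, ref_pt, ref_eta, ref_w, pt_bins, eta_bins):
    """Return per-event weights that reshape the (pt, eta) spectrum of one
    class to match a reference class. Hypothetical helper, not the tool's API."""
    bins = [pt_bins, eta_bins]
    # Weighted 2D histograms; the input weights can already carry cross-section factors.
    ref_hist, _, _ = np.histogram2d(ref_pt, ref_eta, bins=bins, weights=ref_w)
    cur_hist, _, _ = np.histogram2d(pt, eta, bins=bins, weights=base_w)
    # Bin-by-bin ratio; bins with no events to reweight get a ratio of 0.
    ratio = np.divide(ref_hist, cur_hist, out=np.zeros_like(ref_hist), where=cur_hist > 0)
    # Find each event's bin; events outside the binning fall into the nearest edge bin.
    i = np.clip(np.digitize(pt, pt_bins) - 1, 0, len(pt_bins) - 2)
    j = np.clip(np.digitize(eta, eta_bins) - 1, 0, len(eta_bins) - 2)
    return base_w * ratio[i, j]

rng = np.random.default_rng(0)
n = 5000
# Toy kinematics (made-up numbers): flat signal, background skewed to low pt.
sig_pt, sig_eta = rng.uniform(10, 100, n), rng.uniform(-2.5, 2.5, n)
bkg_pt, bkg_eta = 10 + 90 * rng.uniform(0, 1, n) ** 2, rng.uniform(-2.5, 2.5, n)
sig_w = np.ones(n)
bkg_w = np.full(n, 0.5)  # stand-in for a per-event cross-section weight

pt_bins = np.array([10.0, 20.0, 40.0, 60.0, 100.0])
eta_bins = np.array([-2.5, -1.5, 0.0, 1.5, 2.5])
new_bkg_w = pt_eta_weights(bkg_pt, bkg_eta, bkg_w, sig_pt, sig_eta, sig_w, pt_bins, eta_bins)
```

After this step the weighted background histogram in (pt, eta) matches the signal histogram in every populated bin, so the classifier cannot separate the classes on kinematics alone.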
Outputs of the trainer:
- Feature distributions
- Statistics in training and testing
- ROCs, loss plots, MVA scores
- Confusion matrices
- Correlation plots
- Trained models (h5/pb for DNN, pkl for XGBoost)
Optional outputs:
- Threshold values of scores for chosen working points
- Efficiency vs pT and Efficiency vs eta plots for all classes
- Reweighting plots for pT and eta
- Comparison of new ID performance with benchmark ID flags
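As an illustration of the threshold output, the score cut for a working point can be read off as a quantile of the signal score distribution. This is a minimal sketch under stated assumptions: NumPy, toy scores drawn from beta distributions, a working point defined by its target signal efficiency, and a hypothetical helper `thresholds_for_working_points`. It is not necessarily how the tool computes its thresholds.

```python
import numpy as np

def thresholds_for_working_points(sig_scores, target_effs):
    """Score cut for each target signal efficiency, defined here as the
    fraction of signal events with score >= cut. Hypothetical helper."""
    sig_scores = np.asarray(sig_scores, dtype=float)
    # Keeping a fraction `eff` above the cut means cutting at the (1 - eff) quantile.
    return {eff: float(np.quantile(sig_scores, 1.0 - eff)) for eff in target_effs}

rng = np.random.default_rng(1)
sig_scores = rng.beta(5, 2, 10000)  # toy signal scores, peaked towards 1
bkg_scores = rng.beta(2, 5, 10000)  # toy background scores, peaked towards 0

cuts = thresholds_for_working_points(sig_scores, [0.80, 0.90])
for eff, cut in cuts.items():
    sig_eff = np.mean(sig_scores >= cut)  # achieved signal efficiency
    bkg_eff = np.mean(bkg_scores >= cut)  # background efficiency at the same cut
    print(f"WP {eff:.0%}: cut={cut:.3f}, signal eff={sig_eff:.3f}, background eff={bkg_eff:.3f}")
```

A looser working point (higher signal efficiency) always corresponds to a lower cut, so the 90% cut sits below the 80% cut.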