Skip to content

abdelazizhussein/vbfml

 
 

Repository files navigation

ML training tools for VBF H(inv)

Coverage Status

Features

Feature Status
Single-file reading done
Multi-file reading done
Keras conformant output done
Train / validation / test splitting done
Per-event weights from input tree done
Per-dataset weight modifiers (xs, sumw) done

Setup

Note: Set up a python virtual environment before installing to make your life easier.

git clone [email protected]:AndreasAlbert/vbfml.git
python3 -m pip install -e vbfml

# Install pre-commit hooks to automatically
# format code when committing
pre-commit install 

Contributing

Code quality is ensured through extensive testing. When developing a new feature, please write unit tests at the same time. Check out the tests directory to see existing tests for inspiration. Tests are executed using pytest, which is also automatically done for each pull request through github actions. Make sure that all tests pass:

cd vbfml
python3 -m pytest

To avoid having to deal with coding style questions, all code is formatted with black. Please format your code before committing:

# individual file:
black myfile.py

# or whole folder:
black vbfml

Usage example

Check out the scripts/train.py script for an example training work flow. You can set up a new training area:

./scripts/train.py --tag mytag setup

This will create a new working area under the ./output/mytag/ directory. In this directory, all ingredients for training will be saved.

To run training, use e.g.:

./scripts/train.py --tag mytag setup --training-passes 5 --learning-rate 1e-1

The --training-passes argument specifies how often the entire training data set will be processed (In case of the keras setup we use, this is not necessarily equal the number of epochs, which have a user-defined length). The learning rate argument can be omitted, in which case the rate from the last training (or the default one from the setup if no training has happened yet) will be used. If you have already run training in this working area and decide to train again, you can pick up where you left off.

The scripts/analyze_training.py script can be used to create typical postprocessing plots, such as variable distributions. It can be extended with further postprocessing:

./scripts/analyze_training.py plot /path/to/training/folder/

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 100.0%