Merge branch 'master' into reimplement_ATLAS_Z0_7TEV_46FB_CC_AND_CF
ecole41 authored Dec 9, 2024
2 parents 7273e27 + 15d1d62 commit 36e4dbc
Showing 391 changed files with 53,372 additions and 17,325 deletions.
25 changes: 25 additions & 0 deletions .github/workflows/pytorch_test.yml
@@ -0,0 +1,25 @@
name: Test pytorch

on: [push]

jobs:
  run_pytorch:
    runs-on: ubuntu-latest
    env:
      KERAS_BACKEND: torch
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install nnpdf without LHAPDF
        shell: bash -l {0}
        run: |
          pip install .[nolha,torch]
          # Since there is no LHAPDF in the system, initialize the folder and download pdfsets.index
          lhapdf-management update --init
      - name: Test we can run one runcard
        shell: bash -l {0}
        run: |
          cd n3fit/runcards/examples
          n3fit Basic_runcard.yml 4
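
The workflow above selects the PyTorch backend by exporting ``KERAS_BACKEND=torch`` before ``n3fit`` runs. As a rough illustration of that selection logic, the sketch below resolves the backend name from the environment and falls back to ``tensorflow`` when the variable is unset. The helper ``resolve_keras_backend`` and the ``KNOWN_BACKENDS`` tuple are hypothetical names used only here; they are not part of ``n3fit`` or Keras, and the sketch ignores the ``~/.keras/keras.json`` file that Keras also consults.

```python
import os

# Backends mentioned in this commit's documentation changes.
# Restricting to these two is an assumption of this sketch, not a Keras rule.
KNOWN_BACKENDS = ("tensorflow", "torch")


def resolve_keras_backend(environ=None):
    """Return the backend requested through KERAS_BACKEND.

    Falls back to "tensorflow" when the variable is not set.
    """
    env = os.environ if environ is None else environ
    backend = env.get("KERAS_BACKEND", "tensorflow")
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"Backend not covered by this sketch: {backend!r}")
    return backend
```

Keras reads the variable when it is first imported, so it has to be set before any ``import keras`` statement runs, which is why the workflow defines it at the job level.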
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -12,7 +12,7 @@ jobs:
strategy:
matrix:
os: [ubuntu-latest, macos-14]
python-version: ["3.10"] # We need an older python version to avoid conflict with the pymongo pin
python-version: ["3.12"]
fail-fast: false
runs-on: ${{ matrix.os }}
env:
7 changes: 4 additions & 3 deletions conda-recipe/meta.yaml
@@ -19,8 +19,9 @@ requirements:
- pip
run:
- python >=3.9,<3.13
- tensorflow >=2.10,<2.17 # 2.17 works ok but the conda-forge package for macos doesn't
- psutil
- tensorflow >=2.17
- keras >=3.1
- psutil # to ensure n3fit affinity is with the right processors
- hyperopt
- mongodb
- pymongo <4
@@ -44,7 +45,7 @@ requirements:
- joblib
- sphinx_rtd_theme >0.5
- sphinxcontrib-bibtex
- ruamel.yaml <0.18
- ruamel.yaml >=0.15

test:
requires:
2 changes: 1 addition & 1 deletion doc/sphinx/source/get-started/nnpdfmodules.rst
@@ -14,7 +14,7 @@ for an NNPDF fit is displayed in the figure below.
The :ref:`n3fit <n3fitindex>` fitting code
--------------------------------------------------------------------------------
This module implements the core fitting methodology as implemented through
the ``TensorFlow`` framework. The ``n3fit`` library allows
the ``Keras`` framework. The ``n3fit`` library allows
for a flexible specification of the neural network model adopted to
parametrise the PDFs, whose settings can be selected automatically via
the built-in :ref:`hyperoptimization algorithm <hyperoptimization>`. These
3 changes: 1 addition & 2 deletions doc/sphinx/source/n3fit/index.rst
@@ -6,8 +6,7 @@ Fitting code: ``n3fit``
- ``n3fit`` is the next generation fitting code for NNPDF developed by the
N3PDF team :cite:p:`Carrazza:2019mzf`
- ``n3fit`` is responsible for fitting PDFs from NNPDF4.0 onwards.
- The code is implemented in python using `Tensorflow <https://www.tensorflow.org>`_
and `Keras <https://keras.io/>`_.
- The code is implemented in python using `Keras <https://keras.io/>`_ and can run with `Tensorflow <https://www.tensorflow.org>`_ (default) or `pytorch <https://pytorch.org>`_ (with the environment variable ``KERAS_BACKEND=torch``).
- The sections below are an overview of the ``n3fit`` design.


71 changes: 33 additions & 38 deletions doc/sphinx/source/n3fit/methodology.rst
@@ -8,8 +8,8 @@ different in comparison to the latest NNPDF (i.e. `NNPDF3.1 <https://arxiv.org/a
methodology.

.. warning::
The default implementation of the concepts presented here are implemented with Keras and
Tensorflow. The ``n3fit`` code inherits its features, so in this document we avoid the discussion of
The default implementation of the concepts presented here are implemented with Keras.
The ``n3fit`` code inherits its features, so in this document we avoid the discussion of
specific details which can be found in the `Keras documentation <https://keras.io/>`_.

.. note::
@@ -90,7 +90,7 @@ random numbers used in training-validation, ``nnseed`` for the neural network in
Neural network architecture
---------------------------

The main advantage of using a modern deep learning backend such as Keras/Tensorflow consists in the
The main advantage of using a modern deep learning backend such as Keras consists in the
possibility to change the neural network architecture quickly as the developer is not forced to fine
tune the code in order to achieve efficient memory management and PDF convolution performance.

@@ -132,41 +132,36 @@ See the `Keras documentation <https://www.tensorflow.org/api_docs/python/tf/kera

.. code-block:: python
from tensorflow.keras.utils import plot_model
from n3fit.model_gen import pdfNN_layer_generator
from validphys.api import API
fit_info = API.fit(fit="NNPDF40_nnlo_as_01180_1000").as_input()
basis_info = fit_info["fitting"]["basis"]
pdf_models = pdfNN_layer_generator(
nodes=[25, 20, 8],
activations=["tanh", "tanh", "linear"],
initializer_name="glorot_normal",
layer_type="dense",
flav_info=basis_info,
fitbasis="EVOL",
out=14,
seed=42,
dropout=0.0,
regularizer=None,
regularizer_args=None,
impose_sumrule="All",
scaler=None,
parallel_models=1,
)
pdf_model = pdf_models[0]
nn_model = pdf_model.get_layer("NN_0")
msr_model = pdf_model.get_layer("impose_msr")
models_to_plot = {
'plot_pdf': pdf_model,
'plot_nn': nn_model,
'plot_msr': msr_model
}
for name, model in models_to_plot.items():
plot_model(model, to_file=f"./{name}.png", show_shapes=True)
from keras.utils import plot_model
from n3fit.model_gen import pdfNN_layer_generator
from validphys.api import API
fit_info = API.fit(fit="NNPDF40_nnlo_as_01180_1000").as_input()
basis_info = fit_info["fitting"]["basis"]
pdf_model = pdfNN_layer_generator(
nodes=[25, 20, 8],
activations=["tanh", "tanh", "linear"],
initializer_name="glorot_normal",
layer_type="dense",
flav_info=basis_info,
fitbasis="EVOL",
out=14,
seed=42,
dropout=0.0,
regularizer=None,
regularizer_args=None,
impose_sumrule="All",
scaler=None,
)
nn_model = pdf_model.get_layer("pdf_input")
msr_model = pdf_model.get_layer("impose_msr")
models_to_plot = {
'plot_pdf': pdf_model,
'plot_nn': nn_model,
'plot_msr': msr_model
}
This will produce for instance the plot of the PDF model below, and can also be used to plot the
10 changes: 5 additions & 5 deletions doc/sphinx/source/n3fit/runcard_detailed.rst
@@ -42,7 +42,7 @@ The fraction of events that are considered for the training and validation sets
dataset_inputs:
- { dataset: SLAC_NC_NOTFIXED_P_EM-F2, frac: 0.75, variant: legacy_dw}
It is possible to run a fit with no validation set by setting the fraction to ``1.0``, in this case the training set will be used as validation set.

The random seed for the training/validation split is defined by the variable ``trvlseed``.
@@ -280,7 +280,7 @@ of better than 35%) or higher.
Inspecting and profiling the code
---------------------------------

It is possible to inspect the ``n3fit`` code using `TensorBoard <https://www.tensorflow.org/tensorboard/>`_.
It is possible to inspect the ``n3fit`` code using `TensorBoard <https://www.tensorflow.org/tensorboard/>`_ when running with the tensorflow backend.
In order to enable the TensorBoard callback in ``n3fit`` it is enough with adding the following options in the runcard:


@@ -333,7 +333,7 @@ top-level option:
parallel_models: true
Note that currently, in order to run with parallel models, one has to set ``savepseudodata: false``
in the ``fitting`` section of the runcard. Once this is done, the user can run ``n3fit`` with a
in the ``fitting`` section of the runcard. Once this is done, the user can run ``n3fit`` with a
replica range to be parallelized (in this case from replica 1 to replica 4).

.. code-block:: bash
@@ -346,8 +346,8 @@ should run by setting the environment variable ``CUDA_VISIBLE_DEVICES``
to the right index (usually ``0, 1, 2``) or leaving it explicitly empty
to avoid running on GPU: ``export CUDA_VISIBLE_DEVICES=""``

Note that in order to run the replicas in parallel using the GPUs of an Apple Silicon computer (like M1 Mac), it is necessary to also install
the following packages:
Note that in order to run the replicas in parallel using the GPUs of an Apple Silicon computer (like M1 Mac), it is necessary to also install
extra packages. At the time of writing this worked with ``tensorflow`` 2.13.

.. code-block:: bash
35 changes: 18 additions & 17 deletions doc/sphinx/source/tutorials/run-fit.rst
@@ -51,7 +51,7 @@ example of the ``parameter`` dictionary that defines the Machine Learning framew
dropout: 0.0
...
The runcard system is designed such that the user can utilize the program
The runcard system is designed such that the user can utilize the program
without having to tinker with the codebase.
One can simply modify the options in ``parameters`` to specify the
desired architecture of the Neural Network as well as the settings for the optimization algorithm.
@@ -164,7 +164,7 @@ folder, which contains a number of files:
- ``runcard.exportgrid``: a file containing the PDF grid.
- ``runcard.json``: Includes information about the fit (metadata, parameters, times) in json format.

.. note::
.. note::

The reported χ² refers always to the actual χ², i.e., without positivity loss or other penalty terms.

@@ -184,25 +184,26 @@ After obtaining the fit you can proceed with the fit upload and analisis by:

Performance of the fit
----------------------
The ``n3fit`` framework is currently based on `Tensorflow <https://www.tensorflow.org/>`_ and as such, to
first approximation, anything that makes Tensorflow faster will also make ``n3fit`` faster.

.. note::

Tensorflow only supports the installation via pip. Note, however, that the TensorFlow
pip package has been known to break third party packages. Install it at your own risk.
Only the conda tensorflow-eigen package is tested by our CI systems.

When you install the nnpdf conda package, you get the
`tensorflow-eigen <https://anaconda.org/anaconda/tensorflow-eigen>`_ package,
which is not the default. This is due to a memory explosion found in some of
The ``n3fit`` framework is currently based on `Keras <https://keras.io/>`_
and it is tested to run with the `Tensorflow <https://www.tensorflow.org/>`_
and `pytorch <https://pytorch.org>`_ backends.
This also means that anything that makes any of these packages faster will also
make ``n3fit`` faster.
Note that at the time of writing, ``TensorFlow`` is approximately 4 times faster than ``pytorch``.

The default backend for ``keras`` is ``tensorflow``.
In order to change the backend, the environment variable ``KERAS_BACKEND`` needs to be set (e.g., ``KERAS_BACKEND=torch``).

The best results are obtained with ``tensorflow[and-cuda]`` installed from pip.
When you install the nnpdf conda package, you get the
`tensorflow-eigen <https://anaconda.org/anaconda/tensorflow-eigen>`_ package,
which is not the default. This is due to a memory explosion found in some of
the conda mkl builds.

If you want to disable MKL without installing ``tensorflow-eigen`` you can always
If you want to disable MKL without installing ``tensorflow-eigen`` you can always
set the environment variable ``TF_DISABLE_MKL=1`` before running ``n3fit``.
When running ``n3fit`` all versions of the package show similar performance.


When using the MKL version of tensorflow you gain more control of the way Tensorflow will use
the multithreading capabilities of the machine by using the following environment variables:

@@ -214,7 +215,7 @@ the multithreading capabilities of the machine by using the following environmen
These are the best values found for ``n3fit`` when using the mkl version of Tensorflow from conda
and were found for TF 2.1 as the default values were suboptimal.
For a more detailed explanation on the effects of ``KMP_AFFINITY`` on the performance of
the code please see
the code please see
`here <https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support/openmp-library-support/thread-affinity-interface-linux-and-windows.html>`_.

By default, ``n3fit`` will try to use as many cores as possible, but this behaviour can be overridden
4 changes: 2 additions & 2 deletions extra_tests/regression_checks.py
@@ -9,7 +9,7 @@
import pytest

from n3fit.tests.test_fit import EXE, check_fit_results
from reportengine.compat import yaml
from validphys.utils import yaml_safe

REGRESSION_FOLDER = pathlib.Path(__file__).with_name("regression_fits")

@@ -37,7 +37,7 @@ def test_regression_fit(tmp_path, runcard, replica, regenerate):
runcard_file = REGRESSION_FOLDER / runcard_name
shutil.copy(runcard_file, tmp_path)

runcard_info = yaml.load(runcard_file.read_text())
runcard_info = yaml_safe.load(runcard_file.read_text())
if (wname := runcard_info.get("load")) is not None:
shutil.copy(REGRESSION_FOLDER / wname, tmp_path)

4 changes: 2 additions & 2 deletions n3fit/src/evolven3fit/evolve.py
@@ -11,7 +11,7 @@

import eko
from eko import basis_rotation, runner
from reportengine.compat import yaml
from validphys.utils import yaml_safe

from . import eko_utils, utils

@@ -164,7 +164,7 @@ def load_fit(usr_path):
nnfitpath = usr_path / "nnfit"
pdf_dict = {}
for yaml_file in nnfitpath.glob(f"replica_*/{usr_path.name}.exportgrid"):
data = yaml.safe_load(yaml_file.read_text(encoding="UTF-8"))
data = yaml_safe.load(yaml_file.read_text(encoding="UTF-8"))
pdf_dict[yaml_file.parent.stem] = data
return pdf_dict

8 changes: 3 additions & 5 deletions n3fit/src/evolven3fit/utils.py
@@ -4,8 +4,8 @@
import numpy as np
from scipy.interpolate import interp1d

from reportengine.compat import yaml
from validphys.pdfbases import PIDS_DICT
from validphys.utils import yaml_safe

from .q2grids import Q2GRID_DEFAULT, Q2GRID_NNPDF40

@@ -57,7 +57,7 @@ def hasFlavor(self, pid):

def read_runcard(usr_path):
"""Read the runcard and return the relevant information for evolven3fit"""
return yaml.safe_load((usr_path / "filter.yml").read_text(encoding="UTF-8"))
return yaml_safe.load((usr_path / "filter.yml").read_text(encoding="UTF-8"))


def get_theoryID_from_runcard(usr_path):
@@ -99,9 +99,7 @@ def generate_q2grid(Q0, Qfin, Q_points, match_dict, nf0=None, legacy40=False):
frac_of_point = np.log(match_scale / Q_ini) / np.log(Qfin / Q0)
num_points = int(Q_points * frac_of_point)
num_points_list.append(num_points)
grids.append(
np.geomspace(Q_ini**2, match_scale**2, num=num_points, endpoint=False)
)
grids.append(np.geomspace(Q_ini**2, match_scale**2, num=num_points, endpoint=False))
Q_ini = match_scale
num_points = Q_points - sum(num_points_list)
grids.append(np.geomspace(Q_ini**2, Qfin**2, num=num_points))
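
The hunk above only reformats one call, but the surrounding lines show how ``generate_q2grid`` splits ``Q_points`` between matching scales: each segment receives a share proportional to its logarithmic width, and the last segment takes whatever is left. A minimal pure-Python sketch of that arithmetic is given below. The names ``allocate_grid_points`` and ``geomspace`` are hypothetical stand-ins written for this note (the real code uses ``np.geomspace``), and only the lines visible in the hunk are reproduced, so the rest of the function may behave differently.

```python
import math


def allocate_grid_points(q0, qfin, q_points, match_scales):
    """Split q_points among the segments delimited by match_scales.

    Each segment gets a share proportional to its log-width relative to
    the full [q0, qfin] range; the final segment takes the remainder.
    """
    counts = []
    q_ini = q0
    for match_scale in match_scales:
        frac_of_point = math.log(match_scale / q_ini) / math.log(qfin / q0)
        counts.append(int(q_points * frac_of_point))
        q_ini = match_scale
    counts.append(q_points - sum(counts))
    return counts


def geomspace(start, stop, num, endpoint=True):
    """Geometrically spaced points for positive inputs (stand-in for np.geomspace)."""
    if num <= 0:
        return []
    if num == 1:
        return [start]
    steps = num - 1 if endpoint else num
    ratio = (stop / start) ** (1.0 / steps)
    return [start * ratio**i for i in range(num)]
```

Using ``endpoint=False`` for the intermediate segments, as the diff does, keeps each matching scale from appearing twice, since it is emitted as the first point of the following segment.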