Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poisson likelihood without uncertainty implementation #22

Merged
merged 7 commits into from
Nov 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions docs/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,14 @@ Default PDFs
third_moment.third_moment_expansion
third_moment.compute_third_moments

Simple PDFs
~~~~~~~~~~~

.. autosummary::
:toctree: _generated/

simple_pdf.Poisson


Exceptions
----------
Expand Down
32 changes: 32 additions & 0 deletions docs/plugins.rst
Original file line number Diff line number Diff line change
Expand Up @@ -283,6 +283,38 @@ Once again, the exclusion limit can be computed as
* ``analysis`` (optional): Unique identifier for the analysis.
* ``xsection`` (optional): Cross-section value for the signal hypothesis. Units determined by the user.

``'default_pdf.poisson'``
~~~~~~~~~~~~~~~~~~~~~~~~~

Simple Poisson implementation without uncertainties which can be described as follows;

.. math::

\mathcal{L}(\mu) = \prod_{i\in{\rm bins}}{\rm Poiss}(n^i|\mu n_s^i + n_b^i)

It can take any number of yields.

.. code-block:: python3
:linenos:

>>> pdf_wrapper = spey.get_backend("default_pdf.poisson")
>>> statistical_model = pdf_wrapper(
... signal_yields=[12.0, 15.0],
... background_yields=[50.0,48.0],
... data=[36, 33],
... analysis="example",
... xsection=0.123,
... )

**Arguments:**

* ``signal_yields``: keyword for signal yields. It can take one or more values as a list or NumPy array.
* ``background_yields``: keyword for background-only expectations. It can take one or more values as a list or NumPy array.
* ``data``: keyword for observations. It can take one or more values as a list or NumPy array.
* ``analysis`` (optional): Unique identifier for the analysis.
* ``xsection`` (optional): Cross-section value for the signal hypothesis. Units determined by the user.


External Plug-ins
-----------------

Expand Down
9 changes: 8 additions & 1 deletion docs/releases/changelog-v0.1.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Release notes v0.1.2
# Release notes v0.1.3

## New features since last release

Expand All @@ -9,6 +9,10 @@
* $\chi^2$ function has been extended to compute background + signal vs background only model.
([#17](https://github.com/SpeysideHEP/spey/pull/17))

* Poisson based likelihood constructor without uncertainties has been added
(Request by Veronica Sanz for EFT studies).
([#22](https://github.com/SpeysideHEP/spey/pull/22))

## Improvements

* Backend inspection has been added for the models that act like intermediate functions.
Expand All @@ -23,6 +27,9 @@
chi-square computer. See issue [#19](https://github.com/SpeysideHEP/spey/issues/19).
([#20](https://github.com/SpeysideHEP/spey/pull/20))

* Execution error fix during likelihood computation for models with single nuisance parameter.
([#22](https://github.com/SpeysideHEP/spey/pull/22))

## Contributors

This release contains contributions from (in alphabetical order):
Expand Down
1 change: 1 addition & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
"default_pdf.correlated_background = spey.backends.default_pdf:CorrelatedBackground",
"default_pdf.third_moment_expansion = spey.backends.default_pdf:ThirdMomentExpansion",
"default_pdf.effective_sigma = spey.backends.default_pdf:EffectiveSigma",
"default_pdf.poisson = spey.backends.default_pdf.simple_pdf:Poisson",
]

setup(
Expand Down
2 changes: 1 addition & 1 deletion src/spey/_version.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
"""Version number (major.minor.patch[-label])"""

__version__ = "0.1.2"
__version__ = "0.1.3"
289 changes: 289 additions & 0 deletions src/spey/backends/default_pdf/simple_pdf.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,289 @@
"""This file contains basic likelihood implementations"""

from typing import Callable, List, Optional, Text, Tuple, Union

from autograd import grad, hessian
from autograd import numpy as np

from spey._version import __version__
from spey.backends.distributions import MainModel
from spey.base import BackendBase, ModelConfig
from spey.utils import ExpectationType

# pylint: disable=E1101,E1120,W0613


class Poisson(BackendBase):
r"""
Poisson distribution without uncertainty implementation.

.. math::

\mathcal{L}(\mu) = \prod_{i\in{\rm bins}}{\rm Poiss}(n^i|\mu n_s^i + n_b^i)

where :math:`n_{s,b}` are signal and background yields and :math:`n` are the observations.

Args:
signal_yields (``List[float]``): signal yields
background_yields (``List[float]``): background yields
data (``List[int]``): data
"""

name: Text = "default_pdf.poisson"
"""Name of the backend"""
version: Text = __version__
"""Version of the backend"""
author: Text = "SpeysideHEP"
"""Author of the backend"""
spey_requires: Text = __version__
"""Spey version required for the backend"""

__slots__ = ["_model", "_main_model"]

def __init__(
self,
signal_yields: List[float],
background_yields: List[float],
data: List[int],
):
self.data = np.array(data, dtype=np.float64)
self.signal_yields = np.array(signal_yields, dtype=np.float64)
self.background_yields = np.array(background_yields, dtype=np.float64)

minimum_poi = -np.inf
if self.is_alive:
minimum_poi = -np.min(
self.background_yields[self.signal_yields > 0.0]
/ self.signal_yields[self.signal_yields > 0.0]
)

self._main_model = None

self._config = ModelConfig(
poi_index=0,
minimum_poi=minimum_poi,
suggested_init=[1.0],
suggested_bounds=[(minimum_poi, 10)],
)

@property
def is_alive(self) -> bool:
"""Returns True if at least one bin has non-zero signal yield."""
return np.any(self.signal_yields > 0.0)

def config(
self, allow_negative_signal: bool = True, poi_upper_bound: float = 10.0
) -> ModelConfig:
r"""
Model configuration.

Args:
allow_negative_signal (``bool``, default ``True``): If ``True`` :math:`\hat\mu`
value will be allowed to be negative.
poi_upper_bound (``float``, default ``40.0``): upper bound for parameter
of interest, :math:`\mu`.

Returns:
~spey.base.ModelConfig:
Model configuration. Information regarding the position of POI in
parameter list, suggested input and bounds.
"""
if allow_negative_signal and poi_upper_bound == 10.0:
return self._config

return ModelConfig(
self._config.poi_index,
self._config.minimum_poi,
self._config.suggested_init,
[(0, poi_upper_bound)],
)

@property
def main_model(self) -> MainModel:
"""retreive the main model distribution"""
if self._main_model is None:

def lam(pars: np.ndarray) -> np.ndarray:
"""
Compute lambda for Main model with third moment expansion.
For details see above eq 2.6 in :xref:`1809.05548`

Args:
pars (``np.ndarray``): nuisance parameters

Returns:
``np.ndarray``:
expectation value of the poisson distribution with respect to
nuisance parameters.
"""
return pars[0] * self.signal_yields + self.background_yields

self._main_model = MainModel(lam)

return self._main_model

def get_objective_function(
self,
expected: ExpectationType = ExpectationType.observed,
data: Optional[np.ndarray] = None,
do_grad: bool = True,
) -> Callable[[np.ndarray], Union[Tuple[float, np.ndarray], float]]:
r"""
Objective function i.e. negative log-likelihood, :math:`-\log\mathcal{L}(\mu, \theta)`

Args:
expected (~spey.ExpectationType): Sets which values the fitting algorithm should focus and
p-values to be computed.

* :obj:`~spey.ExpectationType.observed`: Computes the p-values with via post-fit
prescriotion which means that the experimental data will be assumed to be the truth
(default).
* :obj:`~spey.ExpectationType.aposteriori`: Computes the expected p-values with via
post-fit prescriotion which means that the experimental data will be assumed to be
the truth.
* :obj:`~spey.ExpectationType.apriori`: Computes the expected p-values with via pre-fit
prescription which means that the SM will be assumed to be the truth.
data (``np.ndarray``, default ``None``): input data that to fit
do_grad (``bool``, default ``True``): If ``True`` return objective and its gradient
as ``tuple`` if ``False`` only returns objective function.

Returns:
``Callable[[np.ndarray], Union[float, Tuple[float, np.ndarray]]]``:
Function which takes fit parameters (:math:`\mu` and :math:`\theta`) and returns either
objective or objective and its gradient.
"""
current_data = (
self.background_yields if expected == ExpectationType.apriori else self.data
)
data = current_data if data is None else data

def negative_loglikelihood(pars: np.ndarray) -> np.ndarray:
"""Compute twice negative log-likelihood"""
return -self.main_model.log_prob(pars, data)

if do_grad:
grad_negative_loglikelihood = grad(negative_loglikelihood, argnum=0)
return lambda pars: (
negative_loglikelihood(pars),
grad_negative_loglikelihood(pars),
)

return negative_loglikelihood

def get_logpdf_func(
self,
expected: ExpectationType = ExpectationType.observed,
data: Optional[np.array] = None,
) -> Callable[[np.ndarray, np.ndarray], float]:
r"""
Generate function to compute :math:`\log\mathcal{L}(\mu, \theta)` where :math:`\mu` is the
parameter of interest and :math:`\theta` are nuisance parameters.

Args:
expected (~spey.ExpectationType): Sets which values the fitting algorithm should focus and
p-values to be computed.

* :obj:`~spey.ExpectationType.observed`: Computes the p-values with via post-fit
prescriotion which means that the experimental data will be assumed to be the truth
(default).
* :obj:`~spey.ExpectationType.aposteriori`: Computes the expected p-values with via
post-fit prescriotion which means that the experimental data will be assumed to be
the truth.
* :obj:`~spey.ExpectationType.apriori`: Computes the expected p-values with via pre-fit
prescription which means that the SM will be assumed to be the truth.
data (``np.array``, default ``None``): input data that to fit

Returns:
``Callable[[np.ndarray], float]``:
Function that takes fit parameters (:math:`\mu` and :math:`\theta`) and computes
:math:`\log\mathcal{L}(\mu, \theta)`.
"""
current_data = (
self.background_yields if expected == ExpectationType.apriori else self.data
)
data = current_data if data is None else data

return lambda pars: self.main_model.log_prob(pars, data)

def get_hessian_logpdf_func(
self,
expected: ExpectationType = ExpectationType.observed,
data: Optional[np.ndarray] = None,
) -> Callable[[np.ndarray], float]:
r"""
Currently Hessian of :math:`\log\mathcal{L}(\mu, \theta)` is only used to compute
variance on :math:`\mu`. This method returns a callable function which takes fit
parameters (:math:`\mu` and :math:`\theta`) and returns Hessian.

Args:
expected (~spey.ExpectationType): Sets which values the fitting algorithm should focus and
p-values to be computed.

* :obj:`~spey.ExpectationType.observed`: Computes the p-values with via post-fit
prescriotion which means that the experimental data will be assumed to be the truth
(default).
* :obj:`~spey.ExpectationType.aposteriori`: Computes the expected p-values with via
post-fit prescriotion which means that the experimental data will be assumed to be
the truth.
* :obj:`~spey.ExpectationType.apriori`: Computes the expected p-values with via pre-fit
prescription which means that the SM will be assumed to be the truth.
data (``np.ndarray``, default ``None``): input data that to fit

Returns:
``Callable[[np.ndarray], float]``:
Function that takes fit parameters (:math:`\mu` and :math:`\theta`) and
returns Hessian of :math:`\log\mathcal{L}(\mu, \theta)`.
"""
current_data = (
self.background_yields if expected == ExpectationType.apriori else self.data
)
data = current_data if data is None else data

def log_prob(pars: np.ndarray) -> np.ndarray:
"""Compute log-probability"""
return self.main_model.log_prob(pars, data)

return hessian(log_prob, argnum=0)

def get_sampler(self, pars: np.ndarray) -> Callable[[int], np.ndarray]:
r"""
Retreives the function to sample from.

Args:
pars (``np.ndarray``): fit parameters (:math:`\mu` and :math:`\theta`)
include_auxiliary (``bool``): wether or not to include auxiliary data
coming from the constraint model.

Returns:
``Callable[[int, bool], np.ndarray]``:
Function that takes ``number_of_samples`` as input and draws as many samples
from the statistical model.
"""

def sampler(sample_size: int, *args, **kwargs) -> np.ndarray:
"""
Fucntion to generate samples.

Args:
sample_size (``int``): number of samples to be generated.

Returns:
``np.ndarray``:
generated samples
"""
return self.main_model.sample(pars, sample_size)

return sampler

def expected_data(self, pars: List[float], **kwargs) -> List[float]:
r"""
Compute the expected value of the statistical model

Args:
pars (``List[float]``): nuisance, :math:`\theta` and parameter of interest,

Returns:
``List[float]``:
Expected data of the statistical model
"""
return self.main_model.expected_data(pars)
Loading