fix: Check parameter shapes for pdf API calls #1461

Open
kratsg wants to merge 5 commits into main from feat/assertExpectedDataAPI

Conversation

kratsg
Contributor

@kratsg kratsg commented May 17, 2021

Pull Request Description

Resolves #1459. Check that expected_data, expected_auxdata, and expected_actualdata are called with the right shape for pars (the parameters) before evaluating. A wrong shape is usually caught by lower-level code, but the resulting error is not as user-friendly.

A quick note: it is ok to add if/raise in these API calls as they're not meant to be "fast" in our code -- but this could potentially be a problem if we need to allow for autodiff capabilities. For now, we're only enforcing that logpdf() calls are the performant ones.
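
For illustration, the kind of check described above might look like the following minimal sketch. It does not use pyhf itself: ToyModel, its npars attribute, and the local InvalidPdfParameters class are hypothetical stand-ins for the real model and for exceptions.InvalidPdfParameters, and len(pars) stands in for the pars.shape[-1] used in the diff.

```python
class InvalidPdfParameters(Exception):
    """Stand-in for pyhf's exceptions.InvalidPdfParameters (assumption)."""


class ToyModel:
    """Hypothetical model with a fixed number of parameters; not pyhf's Model."""

    def __init__(self, npars):
        self.npars = npars

    def expected_data(self, pars):
        # Fail early with a readable message instead of a low-level shape error.
        if len(pars) != self.npars:
            raise InvalidPdfParameters(
                f'eval failed as pars has len {len(pars)} but {self.npars} was expected'
            )
        # Placeholder computation; the real method evaluates the model here.
        return [2.0 * p for p in pars]


model = ToyModel(npars=2)
print(model.expected_data([1.0, 0.5]))  # [2.0, 1.0]
try:
    model.expected_data([1.0])
except InvalidPdfParameters as err:
    print(err)
```

The point of the sketch is only that the if/raise sits in the user-facing call, before any evaluation happens.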

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR
* Add checks on input parameter shapes for the pdf API calls to expected_auxdata,
expected_actualdata, and expected_data.
* Add tests for parameter shape checks to properly raise exceptions.InvalidPdfParameters

@kratsg kratsg added the API (Changes the public API), feat/enhancement (New feature or request), and fix (A bug fix) labels May 17, 2021
@kratsg kratsg self-assigned this May 17, 2021
@codecov

codecov bot commented May 17, 2021

Codecov Report

Merging #1461 (390b100) into master (ce70574) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1461   +/-   ##
=======================================
  Coverage   98.12%   98.12%           
=======================================
  Files          64       64           
  Lines        4270     4278    +8     
  Branches      683      687    +4     
=======================================
+ Hits         4190     4198    +8     
  Misses         46       46           
  Partials       34       34           
Flag        Coverage Δ
contrib     26.20% <0.00%> (-0.05%) ⬇️
doctest     60.47% <0.00%> (-0.12%) ⬇️
unittests   96.18% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files    Coverage Δ
src/pyhf/pdf.py   97.85% <100.00%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ce70574...390b100. Read the comment docs.

@lukasheinrich
Contributor

I wonder whether for these types of API we should have a pattern

def method(self, ...):
    # check inputs
    ...
    return self._method(...)

such that for performance-critical or internal paths (i.e. where pyhf itself calls method) we're able to call the "unsafe" _method, relying on prior checks, while the user gets a friendly API with safety checks.
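
A minimal, self-contained sketch of this split, for illustration only. ToyModel, its methods, and the local InvalidPdfParameters class are hypothetical stand-ins and not pyhf code; the toy logpdf returns a placeholder value rather than a real log-density.

```python
class InvalidPdfParameters(Exception):
    """Stand-in for pyhf's exceptions.InvalidPdfParameters (assumption)."""


class ToyModel:
    """Hypothetical model; not pyhf's Model."""

    def __init__(self, npars):
        self.npars = npars

    def expected_data(self, pars):
        # Public method: check inputs, then delegate to the unchecked version.
        if len(pars) != self.npars:
            raise InvalidPdfParameters(
                f'pars has len {len(pars)} but {self.npars} was expected'
            )
        return self._expected_data(pars)

    def _expected_data(self, pars):
        # "Unsafe" method: assumes pars was already validated by the caller.
        return [10.0 * p for p in pars]

    def logpdf(self, pars):
        # Internal, performance-critical path: skips the check on purpose.
        # The return value is a placeholder, not a real log-density.
        return -sum(self._expected_data(pars))


model = ToyModel(npars=2)
print(model.expected_data([1.0, 2.0]))  # [10.0, 20.0]
print(model.logpdf([1.0, 2.0]))  # -30.0
```

Users only ever see expected_data, so they get the friendly error, while internal callers such as the toy logpdf pay no validation cost.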

@kratsg
Contributor Author

kratsg commented May 17, 2021

I wonder whether for these types of API we should have a pattern

I had this idea where I wanted to use decorators. A decorator can introspect the arguments and do the checks in a consistent way, rather than copy/pasting code around a lot. In this case, it would be something like @pyhf.checks.pars, which checks the number of parameters passed in for pdf calls and would become a passthrough based on some pyhf.config configuration or similar.
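
A rough sketch of what such a decorator could look like, under stated assumptions: pyhf.checks.pars does not exist in this PR, so the example defines a local check_pars decorator, a CONFIG dict standing in for "some pyhf.config configuration", and a hypothetical ToyModel. None of these names are pyhf API.

```python
import functools

# Hypothetical switch standing in for "some pyhf.config configuration".
CONFIG = {'check_pars': True}


class InvalidPdfParameters(Exception):
    """Stand-in for pyhf's exceptions.InvalidPdfParameters (assumption)."""


def check_pars(method):
    """Hypothetical decorator: validate len(pars) against self.npars."""

    @functools.wraps(method)
    def wrapper(self, pars, *args, **kwargs):
        # When the switch is off, the wrapper only forwards the call.
        if CONFIG['check_pars'] and len(pars) != self.npars:
            raise InvalidPdfParameters(
                f'pars has len {len(pars)} but {self.npars} was expected'
            )
        return method(self, pars, *args, **kwargs)

    return wrapper


class ToyModel:
    """Hypothetical model; not pyhf's Model."""

    def __init__(self, npars):
        self.npars = npars

    @check_pars
    def expected_data(self, pars):
        # Placeholder computation.
        return [10.0 * p for p in pars]


model = ToyModel(npars=2)
print(model.expected_data([1.0, 2.0]))  # [10.0, 20.0]

CONFIG['check_pars'] = False
print(model.expected_data([1.0]))  # [10.0], since the check is skipped
```

Reading the switch at call time keeps the toggle dynamic, at the cost of one extra lookup per call. Deciding at decoration time would give a true passthrough, but the setting could then not be changed afterwards.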

            raise exceptions.InvalidPdfParameters(
                f'eval failed as pars has len {pars.shape[-1]} but {self.config.npars} was expected'
            )

        return self.make_pdf(pars)[1].expected_data()

    def _modifications(self, pars):
Contributor

Given that we do input validation here, it suggests that this has become a publicly consumable API. Maybe we should add a model.modifications that does the input checks and calls _modifications if the inputs are OK.

Contributor Author

_modifications isn't currently a public API. It certainly could become one, although I might argue that we remove it from Model and keep it on MainModel unless there's a reason to pass it through; e.g. pdf.main_model.modifications is just as clear to me. Unless the suggestion here is to remove the checks from main_model and constraint_model and keep all checks on model, which is also possible but feels like a mess.

Member

main_model is needed for return_by_sample, so it feels similarly "public" as Model.

Contributor Author

Sure, but main_model.expected_actualdata(..., return_by_sample) is fine, since that's public. However, main_model._modifications isn't necessarily public. At the moment it is only used by main_model.expected_data, so I'm fine either way.

Member

Unless the suggestion here is to remove the checks from main_model and constraint_model

My comment was meant as an example of why that may not be desirable; sorry, I should have made that clear. Promoting _modifications to public in the longer term is a nice idea, as it looks very useful for model debugging.

Member

I created an issue regarding the proposal of making _modifications public: #1652.

Contributor

As it is now, this will be called on each logpdf call, so it's performance-critical. Should we have some kind of split between "pdf.method" and "pdf._method_unsafe"? Or do we think it doesn't make a difference?
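
One way to get a first feel for whether it makes a difference is a toy micro-benchmark like the sketch below. It is only an illustration: ToyModel and its checked / _unchecked methods are hypothetical, the computation is a trivial sum over a Python list, and the numbers say nothing about pyhf's tensor backends, where the cost of the real evaluation would need to be measured directly.

```python
import timeit


class ToyModel:
    """Hypothetical model used only to time the cost of a length check."""

    def __init__(self, npars):
        self.npars = npars

    def checked(self, pars):
        # Validation followed by the actual work.
        if len(pars) != self.npars:
            raise ValueError(
                f'pars has len {len(pars)} but {self.npars} was expected'
            )
        return self._unchecked(pars)

    def _unchecked(self, pars):
        # Placeholder for the real, much heavier evaluation.
        return sum(pars)


model = ToyModel(npars=3)
pars = [1.0, 2.0, 3.0]
n_calls = 100_000

t_checked = timeit.timeit(lambda: model.checked(pars), number=n_calls)
t_unchecked = timeit.timeit(lambda: model._unchecked(pars), number=n_calls)

print(f'checked:   {t_checked / n_calls * 1e9:.0f} ns per call')
print(f'unchecked: {t_unchecked / n_calls * 1e9:.0f} ns per call')
```

The measured difference includes one extra Python method call as well as the comparison itself, so it is an upper bound on the cost of the check alone in this toy setup.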

Member

@lukasheinrich @kratsg If we can revisit the performance impact here soon it would be nice to have this get into v0.7.0.

@matthewfeickert matthewfeickert added the tests (pytest) label May 17, 2021
Member

@matthewfeickert matthewfeickert left a comment

This all LGTM @kratsg — thanks. I'll let you and @lukasheinrich resolve the current discussion and implement any changes that you want, but I'm happy to have this merged whenever you both are.

src/pyhf/pdf.py (outdated, resolved)
@matthewfeickert matthewfeickert force-pushed the feat/assertExpectedDataAPI branch from e36109f to 390b100 Compare December 10, 2021 20:43
@matthewfeickert matthewfeickert changed the base branch from master to main September 21, 2022 20:53
Labels
API (Changes the public API), feat/enhancement (New feature or request), fix (A bug fix), tests (pytest)
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

Input validation for pyhf.pdf.Model.expected_data
4 participants