feat: support variable rebinning #913

Merged (14 commits) on Aug 23, 2024

Member

@Saransh-cpp Saransh-cpp commented Jan 25, 2024

XRef #208

The current interface:

In [1]: import numpy as np
   ...: 
   ...: import boost_histogram as bh
   ...: 
   ...: h = bh.Histogram(bh.axis.Regular(10, 0, 1))
   ...: h.fill(np.random.normal(size=1_000_000))
   ...: rebin = bh.rebin(factor=2)
   ...: h[::rebin]
Out[1]: Histogram(Regular(5, 0, 1), storage=Double()) # Sum: 341605.0 (1000000.0 with flow)

In [2]: rebin = bh.rebin(groups=[1, 2, 3, 4])

In [3]: h[::rebin]
Out[3]: Histogram(Variable([0, 0.1, 0.3, 0.6, 1], metadata=...), storage=Double()) # Sum: 341559.0

In [4]: s = bh.tag.Slicer()
   ...: 
   ...: h = bh.Histogram(
   ...:     bh.axis.Regular(20, 1, 3),
   ...:     bh.axis.Regular(30, 1, 3),
   ...:     bh.axis.Regular(40, 1, 3),
   ...: )
   ...: 
   ...: h[{0: s[:: bh.rebin(groups=[1, 2, 3, 4, 10])]}].axes.size
Out[4]: (5, 30, 40)

In [5]: h[{0: s[:: bh.rebin(groups=[1, 2, 3, 4, 10])],
   ...:    2: s[:: bh.rebin(groups=[1, 2, 3, 4, 10, 20])]}].axes[2].edges
Out[5]: array([1.  , 1.05, 1.15, 1.3 , 1.5 , 2.  , 3.  ])

  • The code is still rough, and I don't know whether it is well optimized.
  • How should the code handle flow bins?
  • Is there an edge case that I am missing?
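
For readers following the examples above, here is a minimal pure-Python sketch of what group-based rebinning amounts to: each entry in `groups` says how many consecutive input bins are merged into one output bin. The helper name `rebin_by_groups` is hypothetical and this is not the code in this PR. It ignores flow bins entirely (how to treat them is one of the open questions above), and the check that the groups must cover every bin is an assumption of the sketch.

```python
from itertools import accumulate


def rebin_by_groups(edges, counts, groups):
    """Merge consecutive bins; groups[i] input bins form output bin i.

    Hypothetical helper for illustration only. Flow bins are not handled.
    """
    # Assumption of this sketch: the groups must account for every bin.
    if sum(groups) != len(counts):
        raise ValueError("groups must sum to the number of bins")

    stops = list(accumulate(groups))   # exclusive end index of each group
    starts = [0] + stops[:-1]          # start index of each group

    # An output edge sits wherever a group ends, plus the very first edge.
    new_edges = [edges[0]] + [edges[stop] for stop in stops]
    # An output count is the sum of the input counts inside the group.
    new_counts = [sum(counts[a:b]) for a, b in zip(starts, stops)]
    return new_edges, new_counts


# Ten regular bins on [0, 1], one entry per bin, merged as 1 + 2 + 3 + 4.
edges = [i / 10 for i in range(11)]
counts = [1.0] * 10
new_edges, new_counts = rebin_by_groups(edges, counts, [1, 2, 3, 4])
print(new_edges)   # [0.0, 0.1, 0.3, 0.6, 1.0]
print(new_counts)  # [1.0, 2.0, 3.0, 4.0]
```

The resulting edges match the `Variable([0, 0.1, 0.3, 0.6, 1])` axis shown in `Out[3]`, which is why an unevenly grouped regular axis comes back as a variable axis.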

cc: @henryiii @matthewfeickert

src/boost_histogram/tag.py (outdated review thread, resolved)
src/boost_histogram/tag.py (outdated review thread, resolved)
@henryiii
Member

Ah, yeah, you probably have to use boost-histogram's cast system to go from C++ class to the correct Python class. I can look (hopefully by end of day or tomorrow, as I'll be teaching soon).

@matthewfeickert matthewfeickert added the enhancement New feature or request label Jan 26, 2024
@Saransh-cpp Saransh-cpp marked this pull request as draft January 26, 2024 15:19
src/boost_histogram/_internal/hist.py (outdated review thread, resolved)
src/boost_histogram/_internal/hist.py (outdated review thread, resolved)
src/boost_histogram/_internal/hist.py (outdated review thread, resolved)
src/boost_histogram/__init__.py (outdated review thread, resolved)
@Saransh-cpp Saransh-cpp requested a review from henryiii February 15, 2024 10:52
@Saransh-cpp Saransh-cpp marked this pull request as ready for review March 8, 2024 15:56
@Saransh-cpp Saransh-cpp changed the title (WIP) feat: support full UHI for rebinning feat: support full UHI for rebinning Mar 8, 2024
@github-actions github-actions bot added the needs changelog Might need a changelog entry label Mar 31, 2024
@rkansal47

Thanks for this very useful feature! I was wondering if this adds (or could add) support for renaming categorical axis values as well?

@henryiii
Member

I still need to review this and make it work on callables.

@henryiii henryiii changed the title feat: support full UHI for rebinning feat: support variable rebinning Aug 22, 2024
Comment on lines 864 to 865
elif getattr(ind.step, "group_mapping", None) is not None:
    groups = ind.step.group_mapping(self.axes[i])
Member


TODO: I didn't update this correctly.
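
For context, the branch under review only checks whether the slice step exposes a `group_mapping` attribute and, if so, calls it with the axis to obtain the groups. A minimal sketch of that duck-typed dispatch follows. The classes `FixedGroups` and `PairUp` and the function `resolve_groups` are hypothetical stand-ins, not boost-histogram API, and a plain list plays the role of the axis.

```python
class FixedGroups:
    """Hypothetical step object that always returns the same groups."""

    def __init__(self, groups):
        self.groups = list(groups)

    def group_mapping(self, axis):
        return self.groups


class PairUp:
    """Hypothetical step object that derives its groups from the axis length."""

    def group_mapping(self, axis):
        n = len(axis)
        # Merge bins two at a time; a leftover bin forms its own group.
        return [2] * (n // 2) + ([1] if n % 2 else [])


def resolve_groups(step, axis):
    """Return the groups for `axis`, or None if `step` has no group_mapping."""
    mapping = getattr(step, "group_mapping", None)
    if mapping is None:
        return None  # e.g. a plain integer step, handled by another branch
    return mapping(axis)


axis = [0] * 5  # stand-in for an axis with five bins
print(resolve_groups(FixedGroups([1, 4]), axis))  # [1, 4]
print(resolve_groups(PairUp(), axis))             # [2, 2, 1]
print(resolve_groups(2, axis))                    # None
```

Because the check is on the attribute rather than on a concrete type, any object that provides `group_mapping(axis)` would take this branch in the sketch.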


from ._internal.typing import AxisLike

- __all__ = ("Slicer", "Locator", "at", "loc", "overflow", "underflow", "rebin", "sum")
+ __all__ = ("Slicer", "Locator", "at", "loc", "overflow", "underflow", "sum", "rebin")
Member


Suggested change
- __all__ = ("Slicer", "Locator", "at", "loc", "overflow", "underflow", "sum", "rebin")
+ __all__ = ("Slicer", "Locator", "at", "loc", "overflow", "underflow", "rebin", "sum")

@henryiii henryiii merged commit 92df5a6 into scikit-hep:develop Aug 23, 2024
17 checks passed
@Saransh-cpp Saransh-cpp deleted the uhi branch August 23, 2024 21:16
5 participants