Remove the dependency on pygam #796

Open
jeongyoonlee opened this issue Oct 4, 2024 · 1 comment
Labels
dependencies Pull requests that update a dependency file

Comments

@jeongyoonlee
Collaborator

Is your feature request related to a problem? Please describe.
pygam pins scipy < 1.12, which blocks the use of numpy 2.x and Cython 3.x.
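
One way to check the pin locally is to read the requirement strings that an installed distribution declares. The helper below is a hypothetical sketch: the name `requirements_on` is made up for illustration and is not part of causalml or pygam. It uses only the standard library and returns an empty list when the distribution is not installed.

```python
import re
from importlib import metadata


def requirements_on(dist_name, dep_name):
    """Return the requirement strings of `dist_name` that name `dep_name`.

    Hypothetical helper for illustration; returns [] if `dist_name`
    is not installed in the current environment.
    """
    try:
        reqs = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []

    matches = []
    for req in reqs:
        # The requirement string starts with the project name,
        # followed by optional extras, version specifiers, and markers.
        name = re.match(r"[A-Za-z0-9._-]+", req)
        if name and name.group(0).lower() == dep_name.lower():
            matches.append(req)
    return matches


# Prints the scipy constraint(s) declared by the installed pygam, if any.
print(requirements_on("pygam", "scipy"))
```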

Describe the solution you'd like
Currently, pygam is used in propensity.calibrate().

We can evaluate alternative calibration methods, e.g., scikit-learn's probability calibration.
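
As a rough illustration of what scikit-learn's probability calibration looks like, the sketch below wraps a propensity model in `CalibratedClassifierCV`. The synthetic data, the choice of `LogisticRegression`, and `method="isotonic"` are assumptions made for illustration only. Note that this approach calibrates a model while fitting it, whereas `propensity.calibrate()` receives scores that have already been computed, so it is not a drop-in replacement.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression

# Synthetic covariates and a binary treatment (assumed data, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
logits = X @ np.array([1.0, -0.5, 0.25])
treatment = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

# Fit the propensity model and its calibrator with 3-fold cross-validation.
calibrated_model = CalibratedClassifierCV(
    LogisticRegression(), method="isotonic", cv=3
)
calibrated_model.fit(X, treatment)

# Column 1 holds the calibrated probability of treatment.
ps = calibrated_model.predict_proba(X)[:, 1]
print(ps[:5])
```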

Describe alternatives you've considered
Alternatively, we can work on pygam directly to see if we can make it work with scipy >= 1.12.

Additional context
N/A

@jeongyoonlee jeongyoonlee added enhancement New feature or request dependencies Pull requests that update a dependency file and removed enhancement New feature or request labels Oct 4, 2024
@jeongyoonlee jeongyoonlee self-assigned this Oct 4, 2024
@jeongyoonlee jeongyoonlee assigned ras44 and unassigned jeongyoonlee Nov 8, 2024
@ras44
Collaborator

ras44 commented Nov 9, 2024

Hi @jeongyoonlee, I took a quick look at this in the following notebook:

https://github.com/ras44/causalml/blob/ras44/remove_pygam_dep_796_dev/docs/examples/pygam_removal_and_testing.ipynb

If we treat log_loss and brier_score_loss as independent indicators of calibration quality (as described here), it looks like we can replace pygam with scikit-learn's IsotonicRegression and either maintain or improve performance:

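from sklearn.isotonic import IsotonicRegression


def calibrate(ps, treatment):
    """Calibrate propensity scores with IsotonicRegression.

    Args:
        ps (numpy.array): a propensity score vector
        treatment (numpy.array): a binary treatment vector (0: control, 1: treated)

    Returns:
        (numpy.array): a calibrated propensity score vector
    """
    # out_of_bounds="clip" maps scores outside the fitted range to the
    # nearest fitted value instead of returning NaN.
    pm_ir = IsotonicRegression(out_of_bounds="clip")
    # Fit a monotone map from raw scores to treatment and apply it to ps.
    ps_ir = pm_ir.fit_transform(ps, treatment)

    return ps_ir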

At the end of the notebook above, I show that IsotonicRegression performs at least as well as pygam on a couple of test cases:

image
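
The kind of comparison described above can be outlined on synthetic data. The sketch below is not the notebook's code: the data-generating process, the deliberately overconfident raw scores, and the 50/50 fit/evaluation split are all assumptions made for illustration. It fits IsotonicRegression on one half of the data and reports log_loss and brier_score_loss on the other half, for both the raw and the calibrated scores.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import brier_score_loss, log_loss

# Synthetic setup (assumed): treatment follows sigmoid(x), while the
# "raw" propensity scores are overconfident because they use sigmoid(3x).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
true_ps = 1.0 / (1.0 + np.exp(-x))
treatment = rng.binomial(1, true_ps)
raw_ps = 1.0 / (1.0 + np.exp(-3.0 * x))

# Fit the calibrator on the first half, evaluate on the second half.
half = n // 2
iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(raw_ps[:half], treatment[:half])
cal_ps = iso.predict(raw_ps[half:])


def score(y, p, eps=1e-6):
    """Return both metrics; clip p away from 0 and 1 so log_loss stays finite."""
    p = np.clip(p, eps, 1.0 - eps)
    return {"log_loss": log_loss(y, p), "brier": brier_score_loss(y, p)}


scores = {
    "raw": score(treatment[half:], raw_ps[half:]),
    "isotonic": score(treatment[half:], cal_ps),
}
for name, s in scores.items():
    print(f"{name:>8}: log_loss={s['log_loss']:.4f}  brier={s['brier']:.4f}")
```

Lower is better for both metrics. Evaluating on held-out data avoids flattering the calibrator, since isotonic regression can fit the sample it was trained on very closely.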

ras44 pushed a commit to ras44/causalml that referenced this issue Nov 9, 2024