Exposure fitting, curve calibration & reach and frequency allocator #1132

Open
wants to merge 28 commits into
base: main
gufengzhou
Contributor

Type of change

  • feat: Exposure fitting is being reactivated. When paid_media_vars differ from paid_media_spend, paid_media_vars are used for fitting. The extrapolation from exposure to spend uses the metric CPE (cost per exposure) = spend / exposure. Accordingly, the Michaelis-Menten function for spend-exposure fitting is deprecated, because it does not improve the fit significantly and is very error-prone. The missing nonlinear relationship between spend and exposure will be addressed by curve calibration.
  • feat: curve calibration. The new robyn_calibrate feature works standalone, without the other main Robyn functions. It consumes a dataframe with two columns: spend and response. This allows users to provide any saturation curve as a source of truth to improve Robyn's saturation parameter identification and to narrow down the sampling ranges for alpha and gamma. Typical sources of truth for saturation curves are Meta conversion lift and Halo cumulative reach.
  • feat: the reach and frequency allocator allows users to find the optimal reach and frequency combination for media planning.
  • Refactor: aligned and simplified the nonlinear transformation process across functions such as run_transformation, robyn_response, robyn_pareto and robyn_allocator
  • Refactor: the inflexion point calculation changed from .dot_product(range(x), gamma) to sum(x) * gamma to allow more flexibility in curve estimation
  • various fixes
  • update of documentation
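For reviewers, the CPE scaling described above can be sketched in a few lines. This is a minimal illustration, not the package code: the function names are hypothetical, and it assumes CPE is aggregated as total spend over total exposure within a window.

```python
def cost_per_exposure(spend, exposure):
    """Return CPE = total spend / total exposure over the given window."""
    total_exposure = sum(exposure)
    if total_exposure == 0:
        raise ValueError("exposure sums to zero; CPE is undefined")
    return sum(spend) / total_exposure


def exposure_to_spend(exposure, cpe):
    """Linearly scale an exposure series onto the spend scale."""
    return [e * cpe for e in exposure]


# Hypothetical weekly data for one paid media channel.
spend = [100.0, 200.0, 300.0]
impressions = [10_000.0, 25_000.0, 25_000.0]

cpe = cost_per_exposure(spend, impressions)
scaled_spend = exposure_to_spend(impressions, cpe)
```

Because the scaler is a single ratio, the scaled series keeps the shape of the exposure series while matching total spend over the window.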

Unit test (tbc)

gufengzhou and others added 25 commits July 30, 2024 23:57
- in order to use real exposure metrics in modeling, fit_spend_exposure must be inverted into spend ~ exposure
- adapted vmax start value to avoid 0 denominator in the inverted equation
- adapted spend exposure plots
- the impact on saturation curve and budget allocation needs to be reassessed
- now only warns about a weak relationship when both rsq_nls and rsq_lm are below the threshold
- exposure_handling function for better R&F integration and readability
- better col readability in dt_plotNLS
- simplify dt_transform ETL
This is the proof of concept of an R&F allocator that includes
- Simulated R&F data
- Implemented multiplicative model
- visualisation of surface
- R&F allocator with nlopt
- constraint validation
- Imp/GRP will be used for parent model fitting. This addition aims to decompose saturated imp into its R&F components with separate saturation
- R&F will obtain the same adstock param as imp and be transformed. No separate adstock param estimation
- Then adstocked R&F will be fitted with different hill transformations (different sets of alphas / gammas)
- The R&F hill params are estimated using a multiplicative equation with Nevergrad
- First simulated results show close to perfect R&F->Imp fitting.
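One possible reading of the multiplicative equation above, as a rough sketch only: the saturated impression response is approximated by the product of a Hill curve on reach and a Hill curve on frequency, each with its own alpha and inflexion. The function names, parameter values and the exact functional form are assumptions for illustration and are not taken from the commits.

```python
def hill(value, alpha, inflexion):
    """Hill saturation for one non-negative value; output lies in [0, 1)."""
    if value <= 0:
        return 0.0
    return value**alpha / (value**alpha + inflexion**alpha)


def rf_response(reach, freq, reach_params, freq_params):
    """Multiplicative surface: saturated reach times saturated frequency."""
    return hill(reach, *reach_params) * hill(freq, *freq_params)


def mse(observed, predicted):
    """Mean squared error, usable as an objective for a parameter search."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical (alpha, inflexion) pairs for reach and frequency.
reach_params = (1.0, 500_000.0)
freq_params = (2.0, 3.0)

# At both inflexion points each factor is 0.5, so the product is 0.25.
at_inflexion = rf_response(500_000.0, 3.0, reach_params, freq_params)
```

An optimizer would then search the two parameter pairs to minimise `mse` between this surface and the saturated impressions.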
- Deprecate function fit_spend_exposure, incl. Michaelis Menten. Nonlinear fitting between spend and exposure wasn't improving fitting significantly. Instead, future curve calibration feature will aim to improve curve identification.
- Use linear model only: cpm as ratio for spend to exposure translation.
- remove minpack.lm / nlsLM dependency
- robyn_input() works
- to-do: export InputCollect$ExposureCollect$plot_spend_exposure
- update exposure handling, esp. introduce metric "cost per exposure" as linear scaler between expo and spend.
- use cpe_window to scale the whole dataset in order to obtain the right spend scale for modeling period.
- simplify check_varnames
- update check_paidmedia
- update  check_factorvars
- exposure_handling with scaled spend will now replace respective media exposure vars in dt_transform
- adapt model.R, incl. reset run_transformations params to have clearer overview of params needed.
- simplify transformation.R by removing unnecessary checks
- remove documentation for fit_spend_exposure
- include rlang::`:=` operator for easier dynamic variable assignment in the future
- check with document update successful: no errors, no warnings, no notes
- In model.R & pareto.R: remove decompSpendDist from both scripts to reduce memory leak. Use xDecompAgg subsets instead
- In transformation.R & response.R: unify transformation namings in run_transformation and robyn_response
- In response.R: remove exposure extrapolation because it's already done in robyn_input. Also add inflexion point to output.
- In plots.R: fix onepager saturation plot issues
- In pareto.R: rewrite run_dt_resp() as response_wrapper and align transformation logic & naming.
- In pareto: Replace foreach response loop with lapply for simplicity.
- In pareto.R: Simplify plot data generation process, esp for saturation curve plot, actual vs predicted plot & immediate vs carryover plot.
- In pareto.R: Remove redundancy in xDecompVecCollect -> remove type rawMedia, rawSpend, predictedExposure, saturatedMedia & saturatedSpendReversed. Only keep adstockedMedia & decompMedia for response curve plotting.
- add titles and change y axis.
- use Halo cumulative reach and simulated spend
- curve fitting with Hill using Nevergrad
- early stop convergence with while loop
- CI range for alpha & gamma
- plotting
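The fitting loop in these commits can be illustrated with a stdlib-only sketch. Note the substitutions: plain random search stands in for Nevergrad, the response is assumed to be normalised to [0, 1], and the names `fit_curve`, `patience` and `max_iter` are hypothetical. The early stop uses a while loop that ends once no candidate has improved the loss for `patience` consecutive draws.

```python
import random


def hill(x, alpha, inflexion):
    """Hill saturation for one non-negative value; output lies in [0, 1)."""
    if x <= 0:
        return 0.0
    return x**alpha / (x**alpha + inflexion**alpha)


def mse_loss(params, spend, response):
    """Mean squared error between the Hill curve and the observed response."""
    alpha, gamma = params
    inflexion = sum(spend) * gamma  # inflexion rule as stated in this PR
    errors = [(hill(s, alpha, inflexion) - r) ** 2 for s, r in zip(spend, response)]
    return sum(errors) / len(errors)


def fit_curve(spend, response, bounds, patience=500, max_iter=20_000, seed=0):
    """Random search over (alpha, gamma) with early stopping on stale draws."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    stale = iterations = 0
    while stale < patience and iterations < max_iter:
        candidate = tuple(rng.uniform(low, high) for low, high in bounds)
        loss = mse_loss(candidate, spend, response)
        if loss < best_loss:
            best_params, best_loss, stale = candidate, loss, 0
        else:
            stale += 1
        iterations += 1
    return best_params, best_loss


# Synthetic source-of-truth curve generated from known parameters.
spend = [float(s) for s in range(0, 101, 10)]
true_alpha, true_gamma = 1.5, 0.1
response = [hill(s, true_alpha, sum(spend) * true_gamma) for s in spend]

bounds = [(0.5, 3.0), (0.01, 1.0)]  # search ranges for alpha and gamma
(alpha_hat, gamma_hat), loss = fit_curve(spend, response, bounds)
```

Repeating the fit with different seeds and taking quantiles of the resulting alpha and gamma values would be one simple way to obtain ranges of the kind mentioned above.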
- Create robyn_calibrate that consumes curve input and outputs hyperparameter ranges for use as model input.
- Rename previous internal robyn_calibrate function as lift_calibration
- add support functions .dot_product, .qti, .mse_loss, geom_density_ci, check_qti
- to-do: Dataframe input df_curve_sot needs to be extended for multiple campaigns. Plot needs to be exported. New checks needed too.
- include inflexion into resultHypParam for better curve calibration handling
- fix plot warning
- replace paid_media_spend with paid_media_selected in script
- clean up variable retrieval and sorting
- deprecate get_hill_params because inflexions are now included in resultHypParam
- fix transformation loop error
- add set_default_hyppar for easier testing
- add param force_curve in robyn_calibrate to allow c or s shape control
- change inflexion calculation to sum(x) * gamma to increase flexibility of inflexion point
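As a small illustration of the new inflexion rule, the sketch below applies inflexion = sum(x) * gamma inside a standard Hill function. The Hill form x^alpha / (x^alpha + inflexion^alpha) is an assumption about the shape used by saturation_hill, and the Python function names are illustrative only.

```python
def inflexion_point(x, gamma):
    """Inflexion as stated in this PR: the sum of the series times gamma."""
    return sum(x) * gamma


def saturation_hill(x, alpha, gamma):
    """Apply a Hill curve to every value of a non-negative series."""
    inflexion = inflexion_point(x, gamma)
    if inflexion <= 0:
        raise ValueError("inflexion must be positive; check x and gamma")
    return [xi**alpha / (xi**alpha + inflexion**alpha) for xi in x]


# Hypothetical adstocked media series.
x = [0.0, 50.0, 100.0, 150.0]
inflexion = inflexion_point(x, gamma=0.25)  # 300 * 0.25
saturated = saturation_hill(x, alpha=2.0, gamma=0.25)
```

For a non-negative series sum(x) is at least max(x), so with this rule the inflexion point can fall beyond the largest observed value, which a rule bounded by range(x) would not allow.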
@facebook-github-bot added the CLA Signed label Nov 12, 2024
simplify allocator checks
- deprecate robyn_object
- simplify usecase
- create internal function transform_decomp to standardise adstock -> saturation -> decomp
- remove response_wrapper and replace with robyn_response
- add param calibrate_inflexion in transform_decomp to consume outcome from robyn_calibrate for saturation
- update check_metric_type to allow both spend and exposure names
adapt the function to the latest changes of saturation_hill
Labels
CLA Signed
3 participants