An extension of sklearn's Lasso/ElasticNet/Ridge models that lets users customize the penalty applied to each covariate. Works similarly to the penalty.factor
parameter in R's glmnet.
Sometimes we have prior knowledge that some covariates are important and others are not. For example, weekend and holiday indicators should be strong predictors of daily traffic flow, and gender should be a strong predictor of breast cancer risk, whereas ice-cream sales should not contribute to crime rate. In such cases, when doing regularized linear regression, we want to penalize different covariates with different weights.
This module also allows one to do basic two-step adaptive regularized regression.
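The two-step adaptive scheme can be sketched with plain sklearn estimators (a hedged illustration, not this package's API; the weight formula $s_i = 1/|\hat\beta_i|^\gamma$ is the usual adaptive-Lasso choice, and the rescaling step is the standard reduction of a weighted Lasso to an ordinary one):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, -2.0] + [0.0] * (p - 2))
y = X @ beta_true + rng.standard_normal(n)

# Step 1: a pilot fit (Ridge here) gives rough coefficient magnitudes.
pilot = Ridge(alpha=1.0).fit(X, y)

# Adaptive-Lasso weights: large pilot coefficients get small penalties.
gamma = 1.0
s = 1.0 / (np.abs(pilot.coef_) ** gamma + 1e-8)

# Step 2: weighted Lasso via rescaling -- penalizing s_i * |beta_i|
# is equivalent to a plain Lasso on the rescaled columns X_i / s_i,
# with the fitted coefficients mapped back by the same factor.
lasso = Lasso(alpha=0.1).fit(X / s, y)
beta_hat = lasso.coef_ / s
```

On this toy data the two truly active covariates survive with little shrinkage while the noise covariates are driven to zero, which is the point of the adaptive step.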
Two classes, CustomENet and CustomENetCV, are provided for regression and for cross-validation along the regularization path. Both accept an additional penalty-weight argument in fit. See the example notebook and docstrings for details.
Let me explain with Lasso for simplicity. The extension to ElasticNet and Ridge is trivial.
In regular Lasso we penalize each covariate with equal weight, i.e., 1:

$$\min_\beta \frac{1}{2n}\|y - X\beta\|_2^2 + \lambda \sum_i |\beta_i|.$$

Now we define a weight vector $s = (s_1, \dots, s_p)$ and change the penalty to $\lambda \sum_i s_i |\beta_i|$. Each weight falls into one of three cases:

1. $s_i > 0$ and finite: penalize $\beta_i$ with weight $s_i$;
2. $s_i = 0$: do not penalize $\beta_i$ at all;
3. $s_i = +\infty$: force $\beta_i = 0$.
For 3 we simply remove the corresponding covariates before fitting. For 1 & 2 we split the covariates into penalized ($s_i > 0$) and unpenalized ($s_i = 0$) groups; each penalized column is rescaled by $1/s_i$, a standard Lasso is fit, and the fitted coefficients are scaled back by the same factor, a change of variables that leaves the objective unchanged.
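The case analysis above can be sketched in code (a minimal illustration of the rescaling trick, not the package's actual implementation; weighted_lasso is a hypothetical helper, and the $s_i = 0$ case is approximated with a tiny positive weight rather than handled exactly):

```python
import numpy as np
from sklearn.linear_model import Lasso

def weighted_lasso(X, y, s, alpha):
    # Hypothetical helper illustrating the rescaling trick.
    # Case 3 (s_i = +inf): drop the column, which fixes beta_i = 0.
    # Case 1 (0 < s_i < inf): rescale the column by 1/s_i so a plain
    #   Lasso penalizes s_i * |beta_i|, then scale the fitted
    #   coefficient back by the same factor.
    # Case 2 (s_i = 0): approximated here by a tiny positive weight,
    #   leaving beta_i essentially unpenalized.
    s = np.asarray(s, dtype=float)
    keep = ~np.isinf(s)
    s_kept = np.maximum(s[keep], 1e-6)
    model = Lasso(alpha=alpha).fit(X[:, keep] / s_kept, y)
    beta = np.zeros(X.shape[1])
    beta[keep] = model.coef_ / s_kept
    return beta

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 3))
y = X @ np.array([2.0, 1.0, 0.5]) + rng.standard_normal(300)

# One covariate per case: normal penalty, no penalty, forced to zero.
beta = weighted_lasso(X, y, s=[1.0, 0.0, np.inf], alpha=0.1)
```

Note that the third coefficient comes back exactly zero despite its covariate being truly active, since an infinite weight excludes it by construction, while the unpenalized second coefficient lands near its least-squares value.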
The algorithm is inspired by this answer: https://stats.stackexchange.com/a/307133/68424