Cross-validation with ridge regression but not normal linear regression? #44

From my review, it seems everyone does cross-validation with ridge regression, but no one ever seems to do cross-validation with a simple linear regression. I understand ridge regression is just linear regression with an L2 penalty term added to the objective function, so what gives? Am I a lunatic if I just run ridge regression without cross-validation?
Any thoughts are appreciated!

Comments
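(For reference, the objective the question describes: ridge regression minimizes the ordinary least-squares loss plus an L2 penalty on the coefficients, with $\lambda \ge 0$ controlling the penalty strength; $\lambda = 0$ recovers plain linear regression.)

$$
\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
$$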
I would assert that most social scientists do not employ cross-validation for most regression models: linear regression, logistic regression, generalized linear models, etc. That doesn't make it right; it's just that most social scientists are not trained in cross-validation methods and don't know how or when to employ them. Is there a reason you do not want to use cross-validation with your ridge regression model?
I am really just looking for an R-squared measure for my model. If I were doing a plain linear regression, I'd just take the usual R-squared from the regression output. But I am doing ridge regression to avoid overfitting (I have more regressors than observations, yikes!), so I want the corresponding R-squared measure for the ridge regression model. The way I think about it, the cross-validation error is really an estimate of out-of-sample fit, while the linear regression R-squared measures only in-sample fit, so they are fundamentally different ways of assessing model fitness, right? For my purposes, since I'm interested in a predictive model and wary of overfitting, I think the cross-validation approach is best (and that is how I've implemented it). Thanks very much for the thoughts!
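A minimal sketch of the distinction described above, using scikit-learn (the data, alpha value, and fold count here are made up for illustration, not taken from the project): in-sample R-squared scores the model on the data it was fit to, while cross-validated R-squared averages scores on held-out folds and so estimates out-of-sample fit.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic data with more regressors than observations (p > n),
# the setting where unpenalized OLS overfits and ridge is useful.
rng = np.random.default_rng(0)
n, p = 50, 100
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0                      # only a few truly informative regressors
y = X @ beta + rng.standard_normal(n)

model = Ridge(alpha=10.0)           # illustrative penalty strength

# In-sample R^2: fit and score on the same data (optimistic).
in_sample_r2 = model.fit(X, y).score(X, y)

# Cross-validated R^2: mean held-out score (estimates out-of-sample fit).
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

print(f"in-sample R^2:       {in_sample_r2:.3f}")
print(f"cross-validated R^2: {cv_r2:.3f}")
```

With p > n, the in-sample number will typically look much better than the cross-validated one, which is exactly the gap between in-sample and out-of-sample fit being discussed.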
Yes, cross-validation is better. Even for inferential studies, cross-validation or out-of-sample test statistics really are better for model comparison and for assessing overall model fit. As we've discussed extensively in the perspectives sequence, relying on training-set statistics can lead you to biased models and incorrect inferences/predictions.
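To make the model-comparison point concrete, a hedged sketch (the candidate models and data are illustrative, not from the project): rank candidates by held-out score rather than by training fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Same illustrative p > n setup as the previous sketch.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 100))
y = X[:, :5].sum(axis=1) + rng.standard_normal(50)

# With p > n, OLS fits the training data perfectly (in-sample R^2 = 1)
# yet generalizes poorly, so comparing models on training fit would
# always pick it; held-out scores reveal the better model.
for name, est in [("OLS", LinearRegression()),
                  ("ridge(alpha=1)", Ridge(alpha=1.0)),
                  ("ridge(alpha=10)", Ridge(alpha=10.0))]:
    cv_r2 = cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    print(f"{name:>15}: mean cross-validated R^2 = {cv_r2:.3f}")
```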