
Add estimate_contrasts method to report() #372

Merged: 10 commits merged into main on Jun 27, 2023

Conversation

@rempsyc (Member) commented May 6, 2023

For now, I have added the estimate_contrasts effect size (Cohen's d, but does it need to be adjusted for contrasts?) in report, following the htest model, but I wonder whether it should instead be done in modelbased directly?

Reprex so far:

library(modelbased)
library(report)

model <- lm(Sepal.Width ~ Species, data = iris)
contr <- estimate_contrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.

report(contr)
#>  The marginal contrasts analysis suggests the following. The difference between
#> setosa and versicolor is positive, statistically significant, and large
#> (difference = 0.66, 95% CI [ 0.49, 0.82], t(147) = 9.69, p < .001, Cohen's d =
#> 1.89, 95% CI [ 1.41, 2.36]). The difference between setosa and virginica is
#> positive, statistically significant, and large (difference = 0.45, 95% CI [
#> 0.29, 0.62], t(147) = 6.68, p < .001, Cohen's d = 1.29, 95% CI [ 0.86, 1.72]).
#> The difference between versicolor and virginica is negative, statistically
#> significant, and medium (difference = -0.20, 95% CI [-0.37, -0.04], t(147) =
#> -3.00, p = 0.003, Cohen's d = -0.64, 95% CI [-1.04, -0.24]).
report_table(contr)
#> Level1     |     Level2 | Difference |             CI |   SE | t(147) |      p | Cohen's d | Cohens_d 95% CI
#> ------------------------------------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001 |      1.89 |  [ 1.41,  2.36]
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001 |      1.29 |  [ 0.86,  1.72]
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003  |     -0.64 |  [-1.04, -0.24]

Created on 2023-05-06 with reprex v2.0.2

Fixes #371

@rempsyc rempsyc requested a review from DominiqueMakowski May 6, 2023 17:38
@rempsyc (Member, Author) commented May 6, 2023

Getting the same error as in #368:

── Error ('test-report_performance.R:105:3'): report_performance Bayesian) ─────
Error in `eval(code, test_env)`: cannot change value of locked binding for 'stan_glmer'

@rempsyc rempsyc requested a review from IndrajeetPatil May 6, 2023 21:18
@DominiqueMakowski (Member):

Very cool!!! Yes, I think it would make sense for effect sizes to be added directly in modelbased.

@rempsyc (Member, Author) commented May 7, 2023

I just now realized that you had started working on this in https://github.com/easystats/report/blob/main/old/report.modelbased.R

However, estimate_contrasts objects do not have the modelbased class, so it was not working. I guess it was a work in progress, which is why it was in the old folder.

OK, so I'll transfer the effect size code to modelbased, and then modify this PR accordingly.

@codecov-commenter commented May 13, 2023

Codecov Report

Merging #372 (2aae74b) into main (ec551a0) will decrease coverage by 0.29%.
The diff coverage is 0.00%.

❗ Current head 2aae74b differs from pull request most recent head ca60c8f. Consider uploading reports for the commit ca60c8f to get more accurate results

@@            Coverage Diff             @@
##             main     #372      +/-   ##
==========================================
- Coverage   71.90%   71.62%   -0.29%     
==========================================
  Files          46       47       +1     
  Lines        3300     3313      +13     
==========================================
  Hits         2373     2373              
- Misses        927      940      +13     
Impacted Files                 | Coverage Δ
R/report.estimate_contrasts.R | 0.00% <0.00%> (ø)

@rempsyc (Member, Author) commented May 14, 2023

@DominiqueMakowski I have removed the Cohen's d calculation and integration from report, since that is being moved to modelbased in easystats/modelbased#227, but I think we can still keep the rest. If you approve, I'll merge this.

@rempsyc rempsyc requested a review from strengejacke May 20, 2023 16:05
@rempsyc (Member, Author) commented May 20, 2023

Updated reprex. @DominiqueMakowski, can I merge this? If so, kindly approve the PR :P

library(modelbased)
library(report)

model <- lm(Sepal.Width ~ Species, data = iris)
contr <- estimate_contrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.

report(contr)
#> The marginal contrasts analysis suggests the following. The difference between
#> setosa and versicolor is positive and statistically significant (difference =
#> 0.66, 95% CI [ 0.49, 0.82], t(147) = 9.69, p < .001). The difference between
#> setosa and virginica is positive and statistically significant (difference =
#> 0.45, 95% CI [ 0.29, 0.62], t(147) = 6.68, p < .001). The difference between
#> versicolor and virginica is negative and statistically significant (difference
#> = -0.20, 95% CI [-0.37, -0.04], t(147) = -3.00, p = 0.003)

report_table(contr)
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p
#> ------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003 
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

Created on 2023-05-20 with reprex v2.0.2

@rempsyc rempsyc requested review from mattansb and etiennebacher June 8, 2023 00:25
@rempsyc rempsyc requested a review from bwiernik June 25, 2023 23:23
@bwiernik (Contributor):

This LGTM. We can add Cohen's d once it's in modelbased.

With respect to adjustment, if there is just one categorical predictor (with however many levels), like in the example, then dividing the contrasts by sigma is fine. If there are multiple predictors or any covariates, it's important to re-compute sigma adding back in the response variance associated with the variables that aren't part of the contrast (or else the Cohen's d scale doesn't really make sense)
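
A minimal sketch of that distinction (the covariate model here is a hypothetical illustration, not code from this PR):

library(modelbased)

# One categorical predictor: dividing the contrasts by the residual sigma is fine
m1 <- lm(Sepal.Width ~ Species, data = iris)
c1 <- estimate_contrasts(m1)
c1$Difference / stats::sigma(m1)  # simple Cohen's d

# With a covariate in the model, sigma(m2) has that covariate's share of the
# response variance partialled out, so dividing by it gives a conditional d
m2 <- lm(Sepal.Width ~ Species + Petal.Length, data = iris)
stats::sigma(m2)

# Re-computing sigma from a refit without the covariate adds that variance back
stats::sigma(lm(Sepal.Width ~ Species, data = iris))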

@mattansb (Member):

(or else the Cohen's d scale doesn't really make sense)

Doesn't it? That would be a conditional Cohen's d, no?

(Also, tell that to Rouder et al, who define their default priors in {BayesFactor} on this effect size exactly 🙄)

@DominiqueMakowski (Member):

Oops, I missed this. All good!! Amazing, thanks a lot

@rempsyc (Member, Author) commented Jun 27, 2023

@bwiernik, my savior 😻, I had to ask 6 reviewers and wait more than a month to get this simple PR approved 🤣🤣

I'll wait for @strengejacke to fix checks on main and then will squash and merge.

I will also update report (in a new PR) when modelbased gets its Cohen's d. About that, @mattansb, I got a bit discouraged about adding the standardized effect sizes in modelbased because I felt like you would want all methods added at once (which is beyond my skill level), rather than one at a time, right?

But am I understanding correctly that I could open a new PR for modelbased, check for the condition length(predictors) == 1 && is.factor(predictor) && is.null(covariates), and if TRUE, add the Cohen's d by dividing the contrasts by sigma, and do nothing in all other conditions? 😬
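
Something like this rough sketch, where the helper name and the insight-based checks are placeholders rather than the eventual modelbased implementation:

# Hypothetical helper: add Cohen's d only in the single-factor, no-covariate case
add_simple_cohens_d <- function(model, contrasts, contrast) {
  predictors <- insight::find_predictors(model)$conditional
  simple_case <- length(predictors) == 1 &&
    is.factor(insight::get_data(model)[[contrast]])
  if (isTRUE(simple_case)) {
    contrasts$Cohens_d <- contrasts$Difference / stats::sigma(model)
  }
  contrasts  # returned unchanged in all other conditions
}

# e.g., with the iris model from the reprex above:
# model <- lm(Sepal.Width ~ Species, data = iris)
# add_simple_cohens_d(model, estimate_contrasts(model), contrast = "Species")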

@bwiernik (Contributor):

That should work for a simple Cohen's d implementation.

For a more complete solution, you could refit the model with just the variables used in the contrasts and get sigma from that.

@strengejacke (Member):

I'll wait for @strengejacke to fix checks on main and then will squash and merge.

#375 still has issues with snapshot tests. Tests via CI expect "2.93", but locally it's "2.94", so there's a mismatch between local and GitHub snapshots.

@mattansb (Member):

@rempsyc I don't mind them one at a time, as long as the default is none 🙃

@rempsyc rempsyc merged commit 75ae425 into main Jun 27, 2023
@rempsyc rempsyc deleted the modelbased_estimate_contrasts branch June 27, 2023 21:02
@rempsyc (Member, Author) commented Jul 2, 2023

 With respect to adjustment, if there is just one categorical predictor (with however many levels), like in the example, then dividing the contrasts by sigma is fine.

@bwiernik, for that first scenario, it seems like this is what emmeans is doing (or rather, how I implemented it in our emmeans method), so it might not be necessary to add it as a separate method?

library(modelbased)
model <- lm(Sepal.Width ~ Species, data = iris)

x <- estimate_contrasts(model)
x$Difference
#> [1]  0.658  0.454 -0.204

new <- x$Difference / stats::sigma(model)
new
#> [1]  1.9370732  1.3365216 -0.6005516

y <- estimate_contrasts(model, effectsize = "emmeans")

old <- y$effect_size
old
#> [1]  1.9370732  1.3365216 -0.6005516

Created on 2023-07-01 with reprex v2.0.2

If there are multiple predictors or any covariates, it's important to re-compute sigma adding back in the response variance associated with the variables that aren't part of the contrast (or else the Cohen's d scale doesn't really make sense)

For a more complete solution, you could refit the model with just the variables used in the contrasts and get sigma from that.

I'll work on that scenario next.

@rempsyc (Member, Author) commented Jul 2, 2023

If there are multiple predictors or any covariates, it's important to re-compute sigma adding back in the response variance associated with the variables that aren't part of the contrast (or else the Cohen's d scale doesn't really make sense)

For a more complete solution, you could refit the model with just the variables used in the contrasts and get sigma from that.

So, if I understand correctly, @bwiernik, you mean to extract the sigma used in emmeans not from the original model, but from a model refit without the contrast variable? If so, is the example below what you had in mind?

library(modelbased)

mtcars2 <- mtcars
mtcars2$cyl <- as.factor(mtcars2$cyl)
model <- lm(mpg ~ cyl + wt * hp, mtcars2)

# For a more complete solution, you could refit the model with just the 
# variables used in the contrasts and get sigma from that.

# INTERPRETATION: WE REMOVE CYL FROM MODEL AND GET SIGMA FROM THAT!
refit_model <- lm(mpg ~ wt * hp, mtcars2)
refit_sigma <- sigma(refit_model)

estimated <- get_emcontrasts(model, contrast = "cyl")

eff <- emmeans::eff_size(
  estimated, sigma = refit_sigma,
  edf = stats::df.residual(model), method = "identity")
eff
#>  contrast      effect.size    SE df lower.CL upper.CL
#>  (cyl4 - cyl6)      0.5849 0.697 26   -0.847     2.02
#>  (cyl4 - cyl8)      0.6756 0.963 26   -1.304     2.66
#>  (cyl6 - cyl8)      0.0907 0.697 26   -1.341     1.52
#> 
#> sigma used for effect sizes: 2.153 
#> Confidence level used: 0.95

Created on 2023-07-02 with reprex v2.0.2

Or did you mean to remove everything from the model EXCEPT the contrast variable?

library(modelbased)

mtcars2 <- mtcars
mtcars2$cyl <- as.factor(mtcars2$cyl)
model <- lm(mpg ~ cyl + wt * hp, mtcars2)

# For a more complete solution, you could refit the model with just the 
# variables used in the contrasts and get sigma from that.

# INTERPRETATION: WE REMOVE EVERYTHING EXCEPT CYL FROM MODEL AND GET SIGMA FROM THAT!
refit_model <- lm(mpg ~ cyl, mtcars2)
refit_sigma <- sigma(refit_model)

estimated <- get_emcontrasts(model, contrast = "cyl")

eff <- emmeans::eff_size(
  estimated, sigma = refit_sigma,
  edf = stats::df.residual(model), method = "identity")
eff
#>  contrast      effect.size    SE df lower.CL upper.CL
#>  (cyl4 - cyl6)      0.3906 0.465 26   -0.566     1.35
#>  (cyl4 - cyl8)      0.4512 0.643 26   -0.871     1.77
#>  (cyl6 - cyl8)      0.0606 0.465 26   -0.896     1.02
#> 
#> sigma used for effect sizes: 3.223 
#> Confidence level used: 0.95

Created on 2023-07-02 with reprex v2.0.2
