Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add effect size (Cohen's d) to estimate_contrasts #227

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

rempsyc
Copy link
Member

@rempsyc rempsyc commented May 8, 2023

As discussed in easystats/report#372

library(modelbased)

model <- lm(Sepal.Width ~ Species, data = iris)
estimate_contrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p | Cohen's d | Cohens_d 95% CI
#> ------------------------------------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001 |      1.89 |  [ 1.41,  2.36]
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001 |      1.29 |  [ 0.86,  1.72]
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003  |     -0.64 |  [-1.04, -0.24]
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris)
estimate_contrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> NOTE: Results may be misleading due to involvement in interactions
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |        95% CI |   SE | t(144) |      p | Cohen's d | Cohens_d 95% CI
#> -----------------------------------------------------------------------------------------------------------
#> setosa     | versicolor |       1.59 | [ 0.64, 2.54] | 0.39 |   4.04 | < .001 |      1.89 |  [ 1.41,  2.36]
#> setosa     |  virginica |       1.77 | [ 0.77, 2.78] | 0.41 |   4.29 | < .001 |      1.29 |  [ 0.86,  1.72]
#> versicolor |  virginica |       0.18 | [-0.17, 0.54] | 0.15 |   1.27 | 0.205  |     -0.64 |  [-1.04, -0.24]
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

data <- mtcars
data$cyl <- as.factor(data$cyl)
data$am <- as.factor(data$am)
model <- rstanarm::stan_glm(mpg ~ cyl * am, data = data, refresh = 0)
estimate_contrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast = "cyl"`.
#> NOTE: Results may be misleading due to involvement in interactions
#> Marginal Contrasts Analysis
#> 
#> Level1 | Level2 | Difference |        95% CI |     pd | % in ROPE | Cohen's d | Cohens_d 95% CI
#> -----------------------------------------------------------------------------------------------
#> cyl4   |   cyl6 |       5.62 | [2.31,  8.85] | 99.95% |        0% |      1.88 |    [0.72, 3.01]
#> cyl4   |   cyl8 |      10.23 | [7.06, 13.48] |   100% |        0% |      3.26 |    [2.02, 4.48]
#> cyl6   |   cyl8 |       4.67 | [1.34,  8.00] | 99.62% |        0% |      2.05 |    [0.91, 3.14]
#> 
#> Marginal contrasts estimated at cyl

Created on 2023-05-07 with reprex v2.0.2

Though it will skip it edge cases where the following arguments are not NULL: contrast, fixed, and at, because those made it hard to compute Cohen's d, which right now fetches the original data to compute it. I don't know if the contrasts modified data is available anywhere so we could calculate proper effect sizes for those cases?

For now though I guess it is still a minor improvement to no effect sizes at all.


@rempsyc rempsyc requested a review from DominiqueMakowski May 8, 2023 00:09
@DominiqueMakowski
Copy link
Member

thanks! and do add yourself in the authors desc

@rempsyc rempsyc closed this May 20, 2023
@rempsyc rempsyc reopened this Jul 1, 2023
@rempsyc rempsyc marked this pull request as draft July 1, 2023 15:07
…, one of "none" (default), "emmeans", or "bootES"
@codecov
Copy link

codecov bot commented Jul 1, 2023

Codecov Report

Merging #227 (3658fa1) into main (052f73e) will decrease coverage by 0.84%.
Report is 1 commits behind head on main.
The diff coverage is 10.00%.

❗ Current head 3658fa1 differs from pull request most recent head 7ebdc3c. Consider uploading reports for the commit 7ebdc3c to get more accurate results

@@            Coverage Diff             @@
##             main     #227      +/-   ##
==========================================
- Coverage   35.10%   34.27%   -0.84%     
==========================================
  Files          25       25              
  Lines        1168     1208      +40     
==========================================
+ Hits          410      414       +4     
- Misses        758      794      +36     
Files Changed Coverage Δ
R/estimate_contrasts.R 56.32% <10.00%> (-39.43%) ⬇️

@rempsyc
Copy link
Member Author

rempsyc commented Jul 1, 2023

Current reprexes:

library(modelbased)

model <- lm(Sepal.Width ~ Species, data = iris)

estimate_contrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p
#> ------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003 
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

estimate_contrasts(model, effectsize = "cohens_d")
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Unsupported effect size 'cohens_d', returning none.
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p
#> ------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003 
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

estimate_contrasts(model, effectsize = "emmeans")
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p | effect_size |      es 95% CI
#> -------------------------------------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001 |        1.94 | [ 1.48,  2.39]
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001 |        1.34 | [ 0.91,  1.76]
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003  |       -0.60 | [-1.00, -0.20]
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

set.seed(100)
estimate_contrasts(model, effectsize = "bootES")
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p | cohens.d | cohens.d 95% CI
#> -----------------------------------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001 |     1.94 |  [ 1.52,  2.39]
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001 |     1.34 |  [ 0.82,  1.72]
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003  |    -0.60 |  [-0.92, -0.22]
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

set.seed(100)
estimate_contrasts(model, effectsize = "bootES", bootES_type = "akp.robust.d")
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Marginal Contrasts Analysis
#> 
#> Level1     |     Level2 | Difference |         95% CI |   SE | t(147) |      p | akp.robust.d | akp.robust.d 95% CI
#> -------------------------------------------------------------------------------------------------------------------
#> setosa     | versicolor |       0.66 | [ 0.49,  0.82] | 0.07 |   9.69 | < .001 |         1.88 |      [ 1.42,  2.49]
#> setosa     |  virginica |       0.45 | [ 0.29,  0.62] | 0.07 |   6.68 | < .001 |         1.37 |      [ 0.92,  1.78]
#> versicolor |  virginica |      -0.20 | [-0.37, -0.04] | 0.07 |  -3.00 | 0.003  |        -0.51 |      [-0.90, -0.17]
#> 
#> Marginal contrasts estimated at Species
#> p-value adjustment method: Holm (1979)

Created on 2023-07-01 with reprex v2.0.2

For the emmeans effect size, I just called it effect_size, because I don't know if we can call it Cohen's d (or something else).

I updated the to-do list at the top post with the missing effect sizes

R/estimate_contrasts.R Outdated Show resolved Hide resolved
@rempsyc
Copy link
Member Author

rempsyc commented Jul 2, 2023

Alright, I added a new method, "marginal", based on Brenton's formula. That said,

library(modelbased)

mtcars2 <- mtcars
mtcars2$cyl <- as.factor(mtcars2$cyl)

model <- lm(mpg ~ cyl + wt * hp, mtcars2)

d values from method “marginal”,

estimate_contrasts(model, effectsize = "marginal")
#> No variable was specified for contrast estimation. Selecting `contrast = "cyl"`.
#> Marginal Contrasts Analysis
#> 
#> Level1 | Level2 | Difference |        95% CI |   SE | t(26) |      p | marginal_d
#> ---------------------------------------------------------------------------------
#> cyl4   |   cyl6 |       1.26 | [-2.55, 5.07] | 1.49 |  0.85 | > .999 |       0.19
#> cyl4   |   cyl8 |       1.45 | [-3.83, 6.74] | 2.06 |  0.70 | > .999 |       0.22
#> cyl6   |   cyl8 |       0.20 | [-3.64, 4.03] | 1.50 |  0.13 | > .999 |       0.03
#> 
#> Marginal contrasts estimated at cyl
#> p-value adjustment method: Holm (1979)

differ a lot from "emmeans",

estimate_contrasts(model, effectsize = "emmeans")
#> No variable was specified for contrast estimation. Selecting `contrast = "cyl"`.
#> Marginal Contrasts Analysis
#> 
#> Level1 | Level2 | Difference |        95% CI |   SE | t(26) |      p | effect_size |     es 95% CI
#> --------------------------------------------------------------------------------------------------
#> cyl4   |   cyl6 |       1.26 | [-2.55, 5.07] | 1.49 |  0.85 | > .999 |        0.57 | [-0.83, 1.97]
#> cyl4   |   cyl8 |       1.45 | [-3.83, 6.74] | 2.06 |  0.70 | > .999 |        0.66 | [-1.27, 2.60]
#> cyl6   |   cyl8 |       0.20 | [-3.64, 4.03] | 1.50 |  0.13 | > .999 |        0.09 | [-1.31, 1.49]
#> 
#> Marginal contrasts estimated at cyl
#> p-value adjustment method: Holm (1979)

Although they seem consistent. Did I get the formula right? I used

R2_cov <- summary(model)$r.squared
d_adj <- contrasts$t * contrasts$SE / sigma(model) * sqrt(1 - R2_cov)

Created on 2023-07-02 with reprex v2.0.2

@rempsyc
Copy link
Member Author

rempsyc commented Jul 2, 2023

Method marginal (should we use a different name, like marginal_d?) for now also does not have confidence intervals. I know that easystats/effectsize#351 has them in the large function, but I wasn't sure if I should implement them here in modelbased or wait for it to be formally implemented in effectsize (since the issue belongs to that repo). At least for now we have an easy way to get the marginal d-value itself.

@rempsyc
Copy link
Member Author

rempsyc commented Aug 5, 2023

@bwiernik if you think this looks good after reading my previous three comments above, then I think we'll be ready to think about merging this. Anything else you would like me to implement (e.g., see from list from original post)?

#' _d_ will be overestimated.
#'
#' `effectsize = "marginal"` uses the following formula to compute effect
#' size: `d_adj <- t * se_b / sigma * sqrt(1 - R2_cov)`. This standardized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that t * se_b is equal to the raw difference - so this can be rewritten as difference * (1- R2)/ sigma, which I think is more clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also means that emmeans::eff_size() can be used here with sigma = stats::sigma(model) / (1 - R2_cov) (and some reasonable edf)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, one thing to note, is that previously I hardcoded sigma as as sigma = stats::sigma(model). Would it be preferable to add a new argument sigma with that as the default, but then people can specify a different one when using the emmeans method? And Should this argument also apply to Brenton's marginal method? (again, instead of hard coding it as sigma(model))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, in the emmeans method, edf is currently hardcoded as edf = stats::df.residual(model), and method as method = "identity". But I can add arguments for those as well (I'm always unsure whether the ellipsis three dots would work when they can be used in several functions with similar argument names, it would probably create conflict for method for example).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in that case you're just making a wrapper around eff_size, yeah?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But I don't know what we should do (wrapper around eff_size with some defaults, or fixed values). Do you have a preference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing 😜🤷‍♂️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants