You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Consider the following Gaussian distribution example:
library(gamboostLSS)
# g_muvsi is a function that allows to see which distribution
# parameter is updated in each iteration.
# If mu is updated, then returns 1, if sigma is updated, then it returns 2.
# The first value in the risk vector is removed as it is the initial value.
g_muvsi <- function(gamboostLSS_object) {
df1 <- data.frame(risk = risk(gamboostLSS_object)$mu[-1], muvsi = 1)
df2 <- data.frame(risk = risk(gamboostLSS_object)$sigma[-1], muvsi = 2)
df <- rbind(df1, df2)
df <- df[order(df$risk, decreasing = TRUE), ]
return(df$muvsi)
}
set.seed(1)
n <- 3
x1 <- c(1,2,3)
mu <- x1
sigma <- exp(.1 * x1)
data <- cbind(x1)
y <- rnorm(3, mu, sigma)
# center x1
cx1 <- scale(x1, center = T, scale = F)
b_g <- gamboostLSS(y ~ bols(cx1),
control = boost_control(mstop = 10),
method = "noncyclic")
By applying the g_muvsi() function to the gamboostLSS object we can find that sigma is updated in the third boosting iteration.
> g_muvsi(b_g)
[1] 1 2 2 1 2 2 1 2 2 1
However, if the negative gradient for mu and sigma after the second boosting iteration is caluclated manually
which means that mu should be updated in third iteration, but not sigma.
According to my debug analysis, I've found that the empirical risk for one distribution parameter in iteration [m] depends on which distribution parameter was updated in the previous iteration [m-1]. If, like this example, sigma is updated in the second iteration, then the risk for mu in the third iteration is calculated with the following formula
Comparing but_risk_mu and risk_si, the winner is sigma, just the same as the gamboostLSS output.
Similarly, if mu is updated in iteration [m-1], then the risk for sigma in iteration [m] is not calculated like risk_si, but a value without step length updates
Consider the following Gaussian distribution example:
By applying the g_muvsi() function to the gamboostLSS object we can find that sigma is updated in the third boosting iteration.
However, if the negative gradient for mu and sigma after the second boosting iteration is caluclated manually
fit them with linear regression
and finally calculate the empirical risk with step length 0.1
we'll get
which means that mu should be updated in third iteration, but not sigma.
According to my debug analysis, I've found that the empirical risk for one distribution parameter in iteration [m] depends on which distribution parameter was updated in the previous iteration [m-1]. If, like this example, sigma is updated in the second iteration, then the risk for mu in the third iteration is calculated with the following formula
Comparing
but_risk_mu
andrisk_si
, the winner is sigma, just the same as the gamboostLSS output.Similarly, if mu is updated in iteration [m-1], then the risk for sigma in iteration [m] is not calculated like risk_si, but a value without step length updates
I think the problem appears somewhere between row 243 and 287 in
gamboostLSS/R/mboostLSS.R
Line 243 in f97c890
The text was updated successfully, but these errors were encountered: