Predictions for future n steps #243
I recently found this work and would like to use it to predict the next n values in a regression setting. Is this possible? If so, could someone please point me to an example?
Replies: 1 comment 7 replies
Sure, this is possible, but it has more to do with how you shape your data than with how you use ngboost. I'm assuming here that for a single unit of observation i in your dataset you have [y_{i,0}, y_{i,1}, ..., y_{i,t}, ..., y_{i,t+n}], so you are trying to use the values [y_{i,0}, ..., y_{i,t}] to predict y_{i,t+n} for some fixed n. At the moment there isn't a simple way to predict all of [y_{i,t+1}, ..., y_{i,t+n}] in one shot, so you have to use one model for t+1, another model for t+2, and so on up to t+n (*footnote).

So, proceeding with some fixed n, what you need to do is shape your data so that you have a matrix X' and a vector Y'. Each row of X' will be X'_i = [y_{i,0}, y_{i,1}, ..., y_{i,t}] and the corresponding element of Y' is Y'_i = y_{i,t+n}. Now simply feed X' and Y' into ngboost and you'll have a model capable of predicting y_{t+n} from any vector [y_0, y_1, ..., y_t]. You can also include additional time-varying or static predictors as additional columns of X' if you have them.

It's important to understand, however, that this model has absolutely no notion of "time". As far as it knows, y_1, y_2, etc. are unrelated numbers that may or may not have anything to do with each other and which may be combined in whatever way is best to predict y_{t+n}. It also does not know or care that y_{t+n} represents the same "measurement" as y_t, just at a future time. Whether that is a feature or a downside of the approach depends on your goals and perspective.

(*footnote) While there is no currently implemented approach that I would advise for simultaneous prediction of [y_{t+1}, ..., y_{t+n}], it is absolutely possible in theory. As long as you can parametrize the vector [y_{t+1}, ..., y_{t+n}] with a finite-dimensional vector of parameters (e.g. with an AR(1) model), you can apply the ngboost algorithm to estimate them and therefore get a probabilistic prediction over [y_{t+1}, ..., y_{t+n}] simultaneously. Technically you could do this with the multivariate normal, but if n is large then that's a ton of parameters to fit and it will take forever (and it ignores the linear structure of time). So it's probably best to think about an autoregressive distribution for this problem, which we don't have implemented and which I had actually not thought of until now.
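
To make the reshaping above concrete, here is a minimal sketch of building X' and Y' for a fixed n and then fitting one model per horizon. It assumes every unit has a series of the same length, stored as one row of a 2-D NumPy array. The names `make_supervised`, `fit_per_horizon`, and `make_model` are hypothetical helpers invented for this illustration and are not part of ngboost.

```python
import numpy as np


def make_supervised(series, t, n):
    """Build (X', Y') for a fixed horizon n.

    series: array of shape (num_units, T); row i is [y_i0, y_i1, ..., y_i(T-1)].
    Returns X' with rows [y_i0, ..., y_it] and Y' with elements y_i(t+n).
    """
    series = np.asarray(series, dtype=float)
    if n < 1 or t < 0 or t + n >= series.shape[1]:
        raise ValueError("need t >= 0, n >= 1 and t + n < series length")
    X = series[:, : t + 1]   # history up to and including time t
    Y = series[:, t + n]     # target value n steps after t
    return X, Y


def fit_per_horizon(series, t, n_max, make_model):
    """Fit one independent model for each horizon 1..n_max.

    make_model: zero-argument callable returning an object with fit(X, Y).
    (Assumption for this sketch: any scikit-learn-style regressor works.)
    """
    models = {}
    for n in range(1, n_max + 1):
        X, Y = make_supervised(series, t, n)
        model = make_model()
        model.fit(X, Y)
        models[n] = model
    return models
```

With ngboost installed, passing `NGBRegressor` as `make_model` should give one probabilistic model per horizon, and calling `models[n].pred_dist(X_new)` on new histories of the same width would return the predicted distribution for step t+n. Extra static or time-varying predictors can be appended to X' with `np.hstack` before fitting.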