This is a great pass at showing how some simple scoring functionality would work. Given what @sbfnk has said to us about the plans that @nikosbosse has to revamp scoringutils, including possibly non-backwards-compatible changes, are you thinking of suggesting this as something to tackle after the scoringutils redesign?
-
What exactly do the pmf and cdf formats look like? One plan of the redesign is that users can provide their own custom functions to [...]. Another plan of the redesign is to make it as easy as possible for users to recreate the scoring functionality for new formats. Maybe we could just add support for the pmf and cdf formats. Writing a general function to convert from one format to the other would be nice. Potentially we could even support that within scoringutils (but we would have to think about whether that's the right place).
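To make the conversion idea concrete, here is a minimal sketch in plain Python (not R, and not part of scoringutils). It assumes a pmf forecast is a list of `(category, probability)` pairs over ordered categories and that a cdf forecast stores cumulative probabilities at the same ordered values. That layout and the function names `pmf_to_cdf` and `cdf_to_pmf` are assumptions for illustration, not the hub specification.

```python
from itertools import accumulate


def pmf_to_cdf(pmf):
    """Convert ordered (category, probability) pairs to cumulative probabilities.

    Assumes the pairs are already sorted in the natural order of the categories.
    """
    categories = [category for category, _ in pmf]
    probs = [prob for _, prob in pmf]
    if any(prob < 0 for prob in probs) or abs(sum(probs) - 1.0) > 1e-8:
        raise ValueError("probabilities must be non-negative and sum to 1")
    cumulative = list(accumulate(probs))
    cumulative[-1] = 1.0  # guard against floating-point drift in the last bin
    return list(zip(categories, cumulative))


def cdf_to_pmf(cdf):
    """Invert pmf_to_cdf by differencing consecutive cumulative probabilities."""
    categories = [category for category, _ in cdf]
    cumulative = [prob for _, prob in cdf]
    probs = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
    if any(prob < -1e-8 for prob in probs):
        raise ValueError("cumulative probabilities must be non-decreasing")
    return list(zip(categories, probs))


# Hypothetical three-category forecast, for illustration only.
example_pmf = [("low", 0.2), ("moderate", 0.5), ("high", 0.3)]
example_cdf = pmf_to_cdf(example_pmf)
```

The conversion only makes sense when the categories are ordered; for nominal categories there is no meaningful cdf.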
-
Introduction/Overview
I'm hoping that we can use this discussion as a space to sort out what we want to do about supporting forecast evaluation. Here are a few desiderata, mostly suggested by Logan:
Many of these items are satisfied by scoringutils (at least numbers 1, 2i, most of 3i [aside from per-capita WIS and one-sided quantile coverage], 3ii, and 4). Important items that don't seem to be satisfied include 3iii and 3iv (i.e., there seems to be no support for pmf-formatted forecasts in scoringutils). Currently, some data massaging is required to get between hub formats and the formats used by scoringutils; see the example below. I haven't tried anything computationally intensive with the package, so I'm not sure about efficiency, but I think they're using data.table, which suggests attention has been paid to this issue.
My overall takeaway is that for quantile and sample format forecasts, it is reasonable to just refer users to scoringutils, likely providing a small function that does some data format conversion. We would need to either work with scoringutils maintainers on adding support for pmf and cdf format forecasts, or implement scoring for those output types ourselves.
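If we did implement pmf scoring ourselves, two common candidates are the log score and the ranked probability score (RPS). The sketch below is plain Python under assumptions of my own: a pmf forecast is a list of `(category, probability)` pairs over ordered categories, lower scores are better, and the RPS is left unnormalized. The function names are hypothetical, and nothing here reflects a decision about which scores or conventions scoringutils or the hub tooling would adopt.

```python
import math


def log_score(pmf, observed):
    """Negative log of the probability assigned to the observed category."""
    prob = dict(pmf)[observed]
    return math.inf if prob == 0 else -math.log(prob)


def ranked_probability_score(pmf, observed):
    """Sum of squared differences between forecast and observed cumulative probabilities.

    Assumes the (category, probability) pairs are sorted in category order.
    """
    categories = [category for category, _ in pmf]
    if observed not in categories:
        raise ValueError("observed category is not in the forecast")
    observed_index = categories.index(observed)
    forecast_cumulative = 0.0
    total = 0.0
    for index, (_, prob) in enumerate(pmf):
        forecast_cumulative += prob
        # The observation's cdf steps from 0 to 1 at the observed category.
        observed_cumulative = 1.0 if index >= observed_index else 0.0
        total += (forecast_cumulative - observed_cumulative) ** 2
    return total


# Hypothetical forecast and observation, for illustration only.
example_pmf = [("low", 0.2), ("moderate", 0.5), ("high", 0.3)]
example_log_score = log_score(example_pmf, "moderate")
example_rps = ranked_probability_score(example_pmf, "moderate")
```

The log score only looks at the probability of the observed category, while the RPS also rewards probability mass placed near it, which matters for ordered targets.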
What evaluation using scoringutils currently looks like
My understanding is that some redesign of scoringutils is currently underway, but to ground the discussion here's a brief overview of what using scoringutils to do forecast evaluation looks like for an example using FluSight forecasts.
I first do some set-up and then load forecast data in the hub format:
This produces the following output:
Note that the FluSight project has forecasts for a categorical target in pmf format, but (a) scoringutils only naturally works with one forecast format at a time (e.g. just quantile forecasts or just sample forecasts), and (b) scoringutils does not support scoring of pmf forecasts. Here we're just filtering to quantile forecasts.
We can also load corresponding target data:
...which looks like this:
The `date` column here matches up with `target_end_date` in the `forecasts` data, and the target value is given by `value`.
To use `scoringutils`, we need to merge the forecast and target data together and get them to have a specific set of column names:
We can use `scoringutils::check_forecasts` to confirm that we have things set up correctly. I subset the output here:
For evaluation, you can first compute some "raw" scores:
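For quantile forecasts, the central raw score is the weighted interval score (WIS). As a rough illustration of the calculation, here is a plain-Python sketch that is not the scoringutils implementation. It assumes a single forecast is a dict mapping quantile levels to predicted values, made up of the median plus symmetric pairs of levels; in that case the WIS equals the average of twice the quantile (pinball) loss across levels. The function names are hypothetical.

```python
def quantile_score(quantile_level, predicted, observed):
    """Pinball loss for a single predicted quantile."""
    if observed >= predicted:
        return quantile_level * (observed - predicted)
    return (1 - quantile_level) * (predicted - observed)


def weighted_interval_score(quantiles, observed):
    """Average of twice the pinball loss over all quantile levels.

    `quantiles` maps quantile level -> predicted value. The result matches the
    usual WIS definition only when the levels are the median plus symmetric
    pairs (e.g. 0.25 and 0.75).
    """
    scores = [
        2 * quantile_score(level, value, observed)
        for level, value in quantiles.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical forecast with a median and one 50% interval, for illustration only.
example_quantiles = {0.25: 8.0, 0.5: 10.0, 0.75: 14.0}
example_wis = weighted_interval_score(example_quantiles, observed=15.0)
```

Writing the score in terms of individual quantiles avoids having to pair up lower and upper interval bounds by floating-point level.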
Then you can add interval coverage measures, and effectively group by and summarize these scores as desired:
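Conceptually, the coverage and summarize step amounts to checking whether each prediction interval contains the observed value and then averaging within groups. The sketch below shows that logic in plain Python; it is not the scoringutils API, and the row layout and the column names `model`, `lower`, `upper`, and `observed` are hypothetical.

```python
from collections import defaultdict


def covers(lower, upper, observed):
    """True if the observed value falls inside the closed interval [lower, upper]."""
    return lower <= observed <= upper


def coverage_by_group(rows, by):
    """Empirical interval coverage, averaged within each combination of `by` columns."""
    hits = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in by)
        hits[key].append(covers(row["lower"], row["upper"], row["observed"]))
    return {key: sum(values) / len(values) for key, values in hits.items()}


# Hypothetical rows, one per forecast of a single interval level.
example_rows = [
    {"model": "A", "lower": 5.0, "upper": 15.0, "observed": 10.0},
    {"model": "A", "lower": 8.0, "upper": 12.0, "observed": 12.0},
    {"model": "B", "lower": 5.0, "upper": 9.0, "observed": 10.0},
    {"model": "B", "lower": 9.0, "upper": 11.0, "observed": 10.0},
]
example_coverage = coverage_by_group(example_rows, by=["model"])
# example_coverage is {("A",): 1.0, ("B",): 0.5}
```

For a well-calibrated model, the empirical coverage of a nominal 50% interval should be close to 0.5 over many forecasts.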
Summing up
Here are some condensed thoughts about what we might like to do to support evaluation/scoring of hub model outputs:
`data_for_su` object above.