-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust epi_slide_opt
output naming; adjust epi_slide*
autogrouping; fix tidyselect issue
#564
base: dev
Are you sure you want to change the base?
Conversation
7a8ec75
to
85503c1
Compare
1627870
to
e6f7264
Compare
|
e6f7264
to
6a22c4b
Compare
- BREAKING CHANGE: adjust default output column naming scheme, disallow overwriting columns.
882e960
to
9669773
Compare
fdc9ee7
to
79ed2f3
Compare
- Show naming options, including with multi-column selections when we have accommodating example data sets - Select away the pre-existing 7d aggregations in the example data set - Ungroup output
845f4c0
to
beb434c
Compare
45170b1
to
f9a8356
Compare
ed23df0
to
6d8bcf6
Compare
epi_slide_opt
output naming; adjust epi_slide*
autogrouping; fix tidyselect bug
epi_slide_opt
output naming; adjust epi_slide*
autogrouping; fix tidyselect bugepi_slide_opt
output naming; adjust epi_slide*
autogrouping; fix tidyselect issue
We may later need it to make default args, in which case we should probably export to be less "magical".
@nmdefries sorry for all the noise here; I tried to implement just 1 time utility from #494 but forgot it actually relied on the concept of a "unit" time delta/step; fixed that up and added the unit time step helper. Think that's fixed now. Also fixed some CHECK issues. The core changes, and now these helpers, should hopefully be ready now. |
#' @param .prefix Optional [`glue::glue`] format string; name the slide result | ||
#' column(s) by attaching this prefix to the corresponding input column(s). | ||
#' Some shorthand is supported for basing the output names on `.window_size` | ||
#' or other arguments; see "Prefix and suffix shorthand" below. | ||
#' @param .suffix Optional [`glue::glue`] format string; like `.prefix`. The | ||
#' default naming behavior is equivalent to `.suffix = | ||
#' "_{.n}{.time_unit_abbr}{.align_abbr}{.f_abbr}"`. Can be used in combination | ||
#' with `.prefix`. | ||
#' @param .new_col_names Optional character vector with length matching the | ||
#' number of input columns from `.col_names`; name the slide result column(s) | ||
#' with these names. Cannot be used in combination with `.prefix` and/or | ||
#' `.suffix`. | ||
#' | ||
#' @section Prefix and suffix shorthand: | ||
#' | ||
#' [`glue::glue`] format strings specially interpret content within curly | ||
#' braces. E.g., `glue::glue("ABC{2 + 2}")` evaluates to `"ABC4"`. For `.prefix` | ||
#' and `.suffix`, we provide `glue` with some additional variable bindings: | ||
#' | ||
#' - `{.n}` will be the number of time steps in the computation | ||
#' corresponding to the `.window_size`. | ||
#' - `{.time_unit_abbr}` will be a lower-case letter corresponding to the | ||
#' `time_type` of `.x` | ||
#' - `{.align_abbr}` will be `""` if `.align` is the default of `"right"`; | ||
#' otherwise, it will be the first letter of `.align` | ||
#' - `{.f_abbr}` will be a character vector containing a short abbreviation | ||
#' for `.f` factoring in the input column type(s) for `.col_names` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: please move this section to man-roxygen/basic-slide-params.R
so it can be included here and for the epi_slide_[mean/opt]
docs too.
man-roxygen/basic-slide-params.R
also has grouping/ungrouping language that might need to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be in basic-slide-params.R, because it's not shared by epi_slide
. It's already shared by {opt,mean,sum} because they are all in one topic. I'm not adding .prefix and .suffix to epi[x]_slide
(yet at least), since it's actually a bit distinct: epi_slide_{opt,mean,sum}
are applying these to input columns to name associated output columns, while epi[x]_slide
would be applying these to names of columns generated by the user computation. #534 suggests this feature might be appreciated though.
Because {opt,mean,sum} are all in one topic, the examples are a bit redundant + the mean & sum examples are smashed next to each other without spacing/separation.
-
todo: fix smashing or just put all the examples in epi_slide_opt and remove redundant ones.
-
todo: update basic-slide-params.R grouping language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basic-slide-params.R already had the appropriate language regarding groupedness of .x
(that was probably a doc bug before this PR, but this PR makes it true). I did update it to mention groupedness/ungroupedness of the output.
I reworked the examples, deleting some, reordering others, tweaking discussion, maybe added another. Could benefit from a sanity check.
) | ||
} | ||
|
||
# Grouping should ensure that we don't have duplicate time values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: How does grouping ensure lack of duplicates?
It's not clear to me that this is true (or at least, we're handling duplicates differently in different places, e.g. epi_slide
does check for duplicated time_value
s, despite now grouping by the same keys as ...opt
does).
I know we check for duplicate keys in as_epi_df.tbl_df
, but I don't see it anywhere else for epi_df
s. dplyr_reconstruct.epi_df
checks for duplicated column names, but not anything else. Notably, new_epi_df
doesn't check for duplicates. It is described as being a "lower-level constructor", but I think a user could conceivably find and use it, and end up with an incorrect result here.
Maybe the issue is just that new_epi_df
is exported when it needn't be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, right, it currently doesn't completely ensure it. Dmitry tested out placing a check in dplyr_reconstruct
but it was too slow for how often it runs. I've looked into a couple of ways to address this:
-
Fixing our
dplyr_extending
usage which is pretty wonky, so we wouldn't run checks as often. (experimentation here, incomplete) -
Making the check faster (Speed up duplicate detection in
as_epi_df()
#560)
I didn't complete those, so -
todo: add back duplicated time values check for now
new_epi_df
shouldn't validate something complicated like this if we try to follow Hadley recommendations. Related: #312.
I'm not sure about not exporting new_epi_df
; #94 suggests it's needed downstream, but I don't actually see it anywhere in epipredict right now. We should probably at least provide a helper epi_df
that acts like tibble
, which would be convenient and make new_epi_df
not stand out as much. Filed off into #572.
Co-authored-by: nmdefries <[email protected]>
Co-authored-by: nmdefries <[email protected]>
Checklist
Please:
PR).
brookslogan, nmdefries.
DESCRIPTION
. Always incrementthe patch version number (the third number), unless you are making a
release PR from dev to main, in which case increment the minor version
number (the second number).
(backwards-incompatible changes to the documented interface) are noted.
Collect the changes under the next release number (e.g. if you are on
1.7.2, then write your changes under the 1.8 heading).
process.
Change explanations for reviewer
Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch
.suffix
(and.prefix
?) arguments to slides #542.epi_slide_opt
work with!!varname
, maybe!!varnames
, maybe!!!varnames
#562