Support v3.0.0 schema sample specification #19
R/validate-config-utils.R
# Validate that compound_taskid_set values are valid task ids for a
# given modeling task group in a given round.
# Returns NULL if not applicable or check passes and error df row if check fails.
validate_mt_sample_comp_tids <- function(model_task_grp,
                                         model_task_i,
                                         round_i,
                                         schema) {
Questions about what's going on here and in `check_compound_taskids_valid`:
- The idea is that `check_compound_taskids_valid` is used when creating a new config file, but this function is used when validating an existing config file?
- There may be value in standardizing on naming conventions:
  - it's not clear to me why we've used `validate` here and `check` in the name of the other function. The `check` function is actually `abort`ing, while this one generates an error message in a data frame that is collected for later use? Are there standard conventions for this kind of thing? I went and looked at `checkmate` docs, but they don't have things starting with `validate`.
  - noting `comp_tids` here vs. `compound_taskids` in the other function name
- would there be any value or possibility to refactor the check logic here and in that other function? It's not much code and it's not exactly identical, so maybe not. But it is quite similar.
So there are two places where dynamic validations that can't be encoded in our schema need to be performed:
- when validating a `tasks.json` file (`validate_mt_sample_comp_tids()`). I used `validate` here because it is part of the config validation process, which itself is based on the concept of using JSON schema to validate JSON data, powered by the package `jsonvalidate`. The output of the `validate` function conforms to the output of `jsonvalidate::json_validate()`, specifically the `errors` attribute tbl, which can then be passed to `view_config_val_errors()`.
- when checking inputs to functions that create config programmatically. Because we are effectively checking inputs to a function, here I opted for `check`, e.g. a la `rlang::check_required()` etc., which indeed is designed to throw an error if the condition is not met.
I believe different languages have related but slightly different conventions (e.g. Python uses `assert` in testing while we use `expect`) and I don't know of any resource in R that dictates the use of specific verbs for specific situations (I believe the checkmate convention is just their own internal convention). As such I've just tried to keep these internally consistent, to separate functions that are used for validation of JSON from functions used to check inputs to functions generating config. I've also kept such functions separate because, although they might be checking the same thing, the inputs and outputs/side effects of each function are so different and take up so much of the logic in each function that trying to shoehorn them into a single function didn't make sense to me.
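To make the two contracts described above concrete, here is a minimal sketch in base R. It is not the package's implementation: the function names, argument names, and data frame columns are all hypothetical, and base `stop()` stands in for whatever abort mechanism the real `check` function uses.

```r
# "validate" style (hypothetical sketch): never throws. Returns NULL when the
# check passes, or a one-row data frame describing the failure so that rows
# from many checks can be collected and displayed later.
validate_comp_tids_sketch <- function(comp_tids, valid_task_ids) {
  invalid <- setdiff(comp_tids, valid_task_ids)
  if (length(invalid) == 0L) {
    return(NULL)
  }
  data.frame(
    keyword = "compound_taskid_set values", # illustrative column only
    message = paste0(
      "Invalid compound_taskid_set value(s): ",
      paste(invalid, collapse = ", ")
    ),
    stringsAsFactors = FALSE
  )
}

# "check" style (hypothetical sketch): guards function inputs. Throws an error
# immediately on failure and returns its input invisibly on success.
check_comp_tids_sketch <- function(comp_tids, valid_task_ids) {
  invalid <- setdiff(comp_tids, valid_task_ids)
  if (length(invalid) > 0L) {
    stop(
      "Invalid compound_taskid_set value(s): ",
      paste(invalid, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(comp_tids)
}
```

The core comparison is a single `setdiff()` in both cases; what differs is everything around it (collecting a row for later display versus failing fast), which is the reason given above for keeping the two functions separate.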
thanks! all sounds good. the one remaining question then is about `comp_tids` vs. `compound_taskids`
Fixed in 33a4859
Thanks! This looks good, I have made a couple of minor comments.
Additionally, I noticed that on lines 146-150 of `R/create_output_type_item.R`, we have the following docstring with some duplication (GitHub didn't let me comment on this or suggest changes as part of this review because this PR didn't introduce this, but I thought maybe we could just clean it up here):
#' This can be combined with other building blocks which can then be written as
#' or appended to `tasks.json` Hub config files.
#' output type.
#' This can be combined with other building blocks which can then be written as
#' or appended to `tasks.json` Hub config files.
Co-authored-by: Evan Ray <[email protected]>
Co-authored-by: Evan Ray <[email protected]>
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
+ Coverage   87.23%   87.92%   +0.69%
==========================================
  Files          20       21       +1
  Lines        1637     1830     +193
==========================================
+ Hits         1428     1609     +181
- Misses        209      221      +12
Thanks! Fixed this in f026d6c
…f basic schema validation passes
…dynamic-vals-note 21/22/ Add note about two stage validation / remove hard coded location data
Co-authored-by: Lucie Contamin <[email protected]>
Rename org name
This PR will:
- `validate_config()`. Specifically, check that min_ is less than or equal to and that `compound_taskid_set` matches valid model task task IDs. Resolves "Dynamic validations of sample output types" #17
- `create_output_type_sample()` to handle v3.0.0 schema. In particular, `create_output_type_sample()` now takes arguments incompatible with previous schema versions and returns an object with an `output_type_id_params` object instead of `output_type_id`. Resolves "Add new `create_output_type_sample()` to handle new v3.0.0 schema specification" #18
- `latest` on main branch when "add v3.0.0 schema with initial sample schema" schemas#72 merged
For further details and context see here
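The first check in the PR description (a minimum that must not exceed a maximum) could be sketched in the same `validate` style, returning `NULL` or an error row. The actual field names are elided in the description, so `min_val` and `max_val` below are hypothetical placeholders, as are the function name and the `message` column; this is an illustration of the pattern, not the package's code.

```r
# Hypothetical sketch: returns NULL if the check is not applicable or passes,
# and a one-row data frame if the minimum exceeds the maximum.
validate_min_max_sketch <- function(min_val, max_val) {
  # Not applicable when either bound is absent
  if (is.null(min_val) || is.null(max_val)) {
    return(NULL)
  }
  if (min_val <= max_val) {
    return(NULL)
  }
  data.frame(
    message = sprintf(
      "min value (%s) must be less than or equal to max value (%s)",
      min_val, max_val
    ),
    stringsAsFactors = FALSE
  )
}

# Rows from several checks can be collected with rbind(); NULL results
# (passing checks) simply drop out.
errors <- rbind(
  validate_min_max_sketch(1, 100),
  validate_min_max_sketch(50, 10)
)
```

Because passing checks return `NULL`, the collected `errors` object only contains rows for failures, which suits a two-stage flow where dynamic checks run after basic schema validation passes.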