2481 bug the result of derive param tte depends on the sort order of the input #2569
…ignal_duplicate_records to derive_param_tte. Still troubleshooting the test-derive_param_tte script. Failed tests have a "Required variable `AEDECOD` is missing in `dataset`" error.
…9, 15, and 16 in test-derive_param_tte
…_tte to fix missing data error
I'll need some help reviewing the test scripts I wrote for this change; I haven't written any to this degree before. I'm also still seeing some errors when I run build_site() and document() on the package, but maybe I implemented the changes incorrectly? @bundfussr @bms63
R/derive_joined.R
Outdated
@@ -478,7 +478,7 @@ derive_vars_joined <- function(dataset,
   derive_var_obs_number(
     new_var = !!tmp_obs_nr,
     by_vars = by_vars_left,
-    check_type = "none"
+    "none"
How come this argument got dropped?
I am not sure. Let me add it back in. I did a fresh pull before working on this again, but I likely made a mistake.
It's still there on my end. I'll push this again.
Reviewing to see if I made any other mistakes.
Pushed again and confirmed that the check_type argument is in the derive_var_obs_number() call in derive_joined.R.
You need to fix this conflict. You might need to accept the updated snapshot.
-    PARAM = past("Time to First", AEDECOD, "Adverse Event"),
+    PARAM = paste("Time to First", AEDECOD, "Adverse Event"),
Please don't change this. The mistake is intentional to cause an error. Now the test is failing because there is no error anymore.
I see. I thought I had changed this by mistake when getting errors back. I'll change it back. Thanks.
R/derive_param_tte.R
Outdated
# check for duplicates in event_data
signal_duplicate_records(
  dataset = event_data,
  by_vars = expr_c(by_vars, subject_keys),
  cnd_type = check_type
)
This check will never fail because filter_date_sources() returns at most one record per by group and subject. That's the reason why the test "derive_param_tte detects duplicates when check_type = 'warning'" fails.
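To illustrate the point with a toy example in base R (this is not admiral code; the data and the helper `first_per_subject()` are hypothetical): once only the first record per subject is kept, a duplicate check on the result has nothing left to find, even though the tie in the unfiltered data is what makes the result depend on the input sort order.

```r
# Toy data (hypothetical): subject "01" has two events on the same date.
ae <- data.frame(
  USUBJID = c("01", "01", "02"),
  AESTDT = as.Date(c("2021-01-01", "2021-01-01", "2021-02-01")),
  AESEQ = c(1, 2, 1)
)

# Hypothetical helper: keep the first record per subject after sorting by date.
# order() leaves ties in their input order, so the input sort order leaks through.
first_per_subject <- function(data) {
  sorted <- data[order(data$USUBJID, data$AESTDT), ]
  sorted[!duplicated(sorted$USUBJID), ]
}

selected <- first_per_subject(ae)
selected_reversed <- first_per_subject(ae[rev(seq_len(nrow(ae))), ])

# After filtering there is one record per subject, so a duplicate check on the
# result can never fire.
dups_after <- anyDuplicated(selected$USUBJID)

# The unfiltered data, in contrast, contains a tie on subject and date.
dups_before <- anyDuplicated(ae[c("USUBJID", "AESTDT")])
```

Here `selected` and `selected_reversed` keep different `AESEQ` values for subject "01", which is why the check has to happen while the records are being selected rather than afterwards.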
I think I get it. I just made a push, but let me take a look.
Okay, let me know if this was effective. The error I was getting was that AEDECOD was not being included in the signal_duplicate_records function. So I bound dataset_adsl and source_datasets together then ran the check within derive_param_tte like so:
# check for duplicates in dataset_adsl and source_datasets
combined_dataset <- bind_rows(dataset_adsl, !!!source_datasets)
signal_duplicate_records(
  dataset = combined_dataset,
  by_vars = expr_c(subject_keys, by_vars),
  cnd_type = check_type
)
All the tests pass now, with some additional warnings. I have to run to some meetings and seminars today but am pushing now for review and can check back in the evening. At first glance, I'm not sure whether the earlier tests should now get a duplicate warning as well.
…om test-derive_param_tte as it was redundant, and ran pharmaverse4devs format test script addin to format testest-derive_param_tte.
…nds-on-the-sort-order-of-the-input
…le dataset_adsl and source_datasets by combining them with bind_rows before to address error of AEDECOD missing from the dataset when just calling dataset_adsl. This starts on line 381 of derive_param_tte.R
…sort-order-of-the-input' of https://github.com/pharmaverse/admiral into 2481-bug-the-result-of-derive_param_tte-depends-on-the-sort-order-of-the-input
…ess failed runs in Test 16 of test-derive_param_tte. removed signal_duplicate_records() from within derive_param_tte Still need to troubleshoot errors in test script.
…ote Test 15 and 16 on test-derive_param_tte to deal with update to duplicate warnings within tryCatch and not directly by signal_duplicate_records inside derive_param_tte function. Accepted snapshots from devtools::check
Hey @bms63, I accepted the snapshots and devtools::check() passed on my end, but it fails when pushed. I noticed in the failure output that the snapshots I accepted didn't seem to go through. Did I miss something?
  order = exprs(!!source_date_var),
  by_vars = expr_c(subject_keys, by_vars),
  mode = mode,
  check_type = check_type
At the moment check_type = "message" is not accepted by filter_extreme(). Could you change this?
arrange(!!!sources[[i]]$order) %>% # Ensure order is applied
  filter_extreme(
    order = exprs(!!source_date_var),
The order specified for the tte_source object should be added to the order argument in the filter_extreme() call, and it should come after !!source_date_var. This ensures that the records are ordered first by date and that all order variables are displayed in the message in case there are duplicates.
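A minimal sketch of how such an `order` argument could be assembled with rlang, assuming the source's extra order variables are stored as a list of expressions (the names `source_order` and `full_order` are hypothetical and not taken from the PR):

```r
library(rlang)

# Date variable of the source (as a symbol) and a hypothetical `order`
# stored on the tte_source object.
source_date_var <- sym("AESTDT")
source_order <- exprs(AESEQ)

# Date first, then the additional order variables.
full_order <- exprs(!!source_date_var, !!!source_order)

# With the default `order = NULL` only the date variable remains.
order_default <- exprs(!!source_date_var, !!!NULL)

# Labels of the resulting order expressions, for inspection.
order_labels <- vapply(full_order, as_label, character(1), USE.NAMES = FALSE)
```

Because the date variable comes first, the extra variables only break ties among records with the same date.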
 #' )
 filter_date_sources <- function(sources,
                                 source_datasets,
                                 by_vars,
                                 create_datetime = FALSE,
                                 subject_keys,
-                                mode) {
+                                mode,
+                                check_type = "none") {
The new argument should be added to the documentation.
      conditionMessage(wrn)
    ), call. = FALSE)
  }
  return(source_dataset)
The function should not return source_dataset but the filtered source dataset.
},
warning = function(wrn) {
  if (grepl("duplicate records", conditionMessage(wrn))) {
    warning(sprintf(
cli_warn() should be used instead of warning().
The warning is duplicated, e.g., "Test 15" produces:
Warning messages:
1: Dataset 'ae' contains duplicate records: Dataset contains duplicate records with respect to `STUDYID`, `USUBJID`, and `AESTDT`
ℹ Run admiral::get_duplicates_dataset() to access the duplicate records
2: Dataset contains duplicate records with respect to `STUDYID`, `USUBJID`, and `ADT`
ℹ Run admiral::get_duplicates_dataset() to access the duplicate records
Could you update this so that we get the following instead?
Warning message:
Dataset 'ae' contains duplicate records with respect to `STUDYID`, `USUBJID`, and `AESTDT`
ℹ Run admiral::get_duplicates_dataset() to access the duplicate records
    check_type = check_type
  )
},
warning = function(wrn) {
This catches warnings only. Errors and messages should be handled as well. (Maybe https://rlang.r-lib.org/reference/try_fetch.html is helpful.)
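As a rough sketch of handling all three condition types while still returning the filtered data, here is one way it could look in base R. It uses `withCallingHandlers()` rather than the suggested `try_fetch()`, and everything in it is an assumption for illustration: `signal_duplicates()` is a hypothetical stand-in for the `filter_extreme()` call, `add_dataset_name()` is a hypothetical wrapper, and the message text is copied from the warning quoted in this thread.

```r
# Hypothetical stand-in for the filter_extreme() call: it signals a duplicate
# condition of the requested type and then returns a (fake) filtered dataset.
signal_duplicates <- function(check_type) {
  msg <- "Dataset contains duplicate records with respect to `STUDYID`, `USUBJID`, and `AESTDT`"
  switch(check_type,
    warning = warning(msg, call. = FALSE),
    message = message(msg),
    error = stop(msg, call. = FALSE),
    none = NULL
  )
  "filtered data"
}

# Hypothetical wrapper: re-signal each condition once, with the dataset name
# added to the message, and keep the original condition type.
add_dataset_name <- function(expr, dataset_name) {
  relabel <- function(cnd) {
    msg <- sub("\n$", "", conditionMessage(cnd)) # messages end with a newline
    sub("^Dataset contains", paste0("Dataset '", dataset_name, "' contains"), msg)
  }
  withCallingHandlers(
    expr,
    warning = function(cnd) {
      warning(relabel(cnd), call. = FALSE)
      invokeRestart("muffleWarning") # drop the original, continue evaluation
    },
    message = function(cnd) {
      message(relabel(cnd))
      invokeRestart("muffleMessage")
    },
    error = function(cnd) {
      stop(relabel(cnd), call. = FALSE)
    }
  )
}

# Capture the re-signalled warning text instead of printing it.
caught <- tryCatch(
  add_dataset_name(signal_duplicates("warning"), "ae"),
  warning = function(w) conditionMessage(w)
)
```

Because the original warning and message are muffled after being re-signalled, each duplicate is reported once and the wrapped expression still returns its value. In the package itself the cli functions would presumably replace `warning()`, `message()`, and `stop()`.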
- `check_type = "warning"` default argument added to `derive_param_tte` with an
`arg_match` function within the function so the user can use a valid input of
`error, message, warning, or none`. `signal_duplicate_records()` has also been
added to the function on lines 394 and 411 to check for uniqueness of records. (#2481)

- `order()` function has been added to `event_source()` and `censor_source()` and
defaulted to `NULL` to allow sorting of input data. (#2481)
This should be moved up to the "development version" section because admiral 1.1.1 is already released.
- `order()` function has been added to `event_source()` and `censor_source()` and
defaulted to `NULL` to allow sorting of input data. (#2481)
-- `order()` function has been added to `event_source()` and `censor_source()` and
-defaulted to `NULL` to allow sorting of input data. (#2481)
+- `order` argument has been added to `event_source()` and `censor_source()` and
+defaulted to `NULL` to allow specifying variables in addition to the date variable.
+This can be used to ensure the uniqueness of the selected records if there is more
+than one record per date. (#2481)
- `check_type = "warning"` default argument added to `derive_param_tte` with an
`arg_match` function within the function so the user can use a valid input of
`error, message, warning, or none`. `signal_duplicate_records()` has also been
added to the function on lines 394 and 411 to check for uniqueness of records. (#2481)
Could you rephrase this item such that it describes how the change affects the users? No need to add details for developers.
                             subject_keys = get_admiral_option("subject_keys"),
                             check_type = "warning") {
  # Match check_type to valid admiral options
  check_type <- rlang::arg_match(check_type, c("warning", "message", "error", "none"))
Please use the admiraldev assertions (assert_character_scalar()) and move it to the "checking and quoting" section below.
Thank you for your Pull Request! We have developed this task checklist from the Development Process Guide to help with the final steps of the process. Completing the below tasks helps to ensure our reviewers can maximize their time on your code as well as making sure the admiral codebase remains robust and consistent.
Please check off each taskbox as an acknowledgment that you completed the task or check off that it is not relevant to your Pull Request. This checklist is part of the GitHub Action workflows and the Pull Request will not be merged into the `main` branch until you have checked off each task.
- `styler::style_file()` to style R and Rmd files
- `inst/cheatsheet/admiral_cheatsheet.pptx` and re-upload a PDF and a PNG version of it to the same folder. (The PNG version can be created by taking a screenshot of the PDF version.)
- `devtools::document()` so all `.Rd` files in the `man` folder and the `NAMESPACE` file in the project root are updated appropriately
- `NEWS.md` under the header `# admiral (development version)` if the changes pertain to a user-facing function (i.e. it has an `@export` tag) or documentation aimed at users (rather than developers). A Developer Notes section is available in `NEWS.md` for tracking developer-facing issues.
- `pkgdown::build_site()` and check that all affected examples are displayed correctly and that all new functions occur on the "Reference" page.
- `lintr::lint_package()`
- `R CMD check` locally and address all errors and warnings - `devtools::check()`