2481 bug the result of derive param tte depends on the sort order of the input #2569

Open
wants to merge 13 commits into base: main

ProfessorP-beep (Collaborator) commented Nov 18, 2024

Thank you for your Pull Request! We have developed this task checklist from the Development Process Guide to help with the final steps of the process. Completing the below tasks helps to ensure our reviewers can maximize their time on your code as well as making sure the admiral codebase remains robust and consistent.

Please check off each taskbox as an acknowledgment that you completed the task or check off that it is not relevant to your Pull Request. This checklist is part of the Github Action workflows and the Pull Request will not be merged into the main branch until you have checked off each task.

  • Place Closes #<insert_issue_number> into the beginning of your Pull Request Title (Use Edit button in top-right if you need to update)
  • Code is formatted according to the tidyverse style guide. Run styler::style_file() to style R and Rmd files
  • Updated relevant unit tests or have written new unit tests, which should consider realistic data scenarios and edge cases, e.g. empty datasets, errors, boundary cases etc. - See Unit Test Guide
  • If you removed/replaced any function and/or function parameters, did you fully follow the deprecation guidance?
  • Review the Cheat Sheet. Make any required updates to it by editing the file inst/cheatsheet/admiral_cheatsheet.pptx and re-upload a PDF and a PNG version of it to the same folder. (The PNG version can be created by taking a screenshot of the PDF version.)
  • Update to all relevant roxygen headers and examples, including keywords and families. Refer to the categorization of functions to tag appropriate keyword/family.
  • Run devtools::document() so all .Rd files in the man folder and the NAMESPACE file in the project root are updated appropriately
  • Address any updates needed for vignettes and/or templates
  • Update NEWS.md under the header # admiral (development version) if the changes pertain to a user-facing function (i.e. it has an @export tag) or documentation aimed at users (rather than developers). A Developer Notes section is available in NEWS.md for tracking developer-facing issues.
  • Build admiral site pkgdown::build_site() and check that all affected examples are displayed correctly and that all new functions occur on the "Reference" page.
  • Address or fix all lintr warnings and errors - lintr::lint_package()
  • Run R CMD check locally and address all errors and warnings - devtools::check()
  • Link the issue in the Development Section on the right hand side.
  • Address all merge conflicts and resolve appropriately
  • Pat yourself on the back for a job well done! Much love to your accomplishment!

…ignal_duplicate_records to derive_param_tte.

Still troubleshooting the test-derive_param_tte script. Failed tests have a "Required variable `AEDECOD` is missing in `dataset`" error.
ProfessorP-beep (Collaborator, Author) commented Nov 18, 2024

I'll need some help reviewing the test scripts I wrote for this change. I haven't written any to this degree before. I'm also still seeing some errors when I run build_site() and document() on the package, but maybe I implemented the changes incorrectly? @bundfussr @bms63

@@ -478,7 +478,7 @@ derive_vars_joined <- function(dataset,
     derive_var_obs_number(
       new_var = !!tmp_obs_nr,
       by_vars = by_vars_left,
-      check_type = "none"
+      "none"
Collaborator

How come this argument got dropped?

Collaborator Author

I am not sure. Let me add it back in. I did a fresh pull before working on this again, but I likely made a mistake.

ProfessorP-beep (Collaborator, Author), Nov 18, 2024

It's still there on my end. I'll push this again.

Collaborator Author

Reviewing to see if I made any other mistakes.

Pushed again and confirmed that the check_type argument is in the derive_var_obs_number() call in derive_joined.R.
bms63 (Collaborator) commented Nov 18, 2024

You need to fix this conflict. You might need to accept the updated snapshot

-    PARAM = past("Time to First", AEDECOD, "Adverse Event"),
+    PARAM = paste("Time to First", AEDECOD, "Adverse Event"),
bundfussr (Collaborator), Nov 18, 2024

Please don't change this. The mistake is intentional to cause an error. Now the test is failing because there is no error anymore.

ProfessorP-beep (Collaborator, Author), Nov 18, 2024

I see. I thought I had changed this by mistake when getting errors back. I'll change it back. Thanks.

Comment on lines 393 to 398
  # check for duplicates in event_data
  signal_duplicate_records(
    dataset = event_data,
    by_vars = expr_c(by_vars, subject_keys),
    cnd_type = check_type
  )
Collaborator

This check will never fail because filter_date_sources() returns at most one record per by group and subject. That's the reason why the test "derive_param_tte detects duplicates when check_type = 'warning'" fails.
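A minimal sketch of the point above, using made-up data and plain dplyr calls as a stand-in for the selection step (the internals of filter_date_sources() are not shown in this thread, so `first_events` is only an assumption about its effect). Once one record per subject has been selected, a duplicate check on the result has nothing left to find, while the record that was selected still depends on the input order:

```r
library(dplyr)

# Made-up source data: subject "01" has two events on the same date.
ae <- tibble(
  USUBJID = c("01", "01", "02"),
  AESTDT = as.Date(c("2021-01-05", "2021-01-05", "2021-02-01")),
  AESEQ = c(2, 1, 1)
)

# Duplicate check on the raw data, with respect to subject and date.
dups_before <- ae %>%
  count(USUBJID, AESTDT) %>%
  filter(n > 1)

# Stand-in for the selection step: first record per subject by date only.
first_events <- ae %>%
  group_by(USUBJID) %>%
  arrange(AESTDT, .by_group = TRUE) %>%
  slice(1) %>%
  ungroup()

# The same kind of check after selection cannot find anything.
dups_after <- first_events %>%
  count(USUBJID) %>%
  filter(n > 1)

nrow(dups_before)
nrow(dups_after)
```

For the check to be meaningful, it would have to run on the data before the selection, with the date variable included in the by variables.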

Collaborator Author

I think I get it. I just made a push, but let me take a look.

Collaborator Author

Okay, let me know if this was effective. The error I was getting was that AEDECOD was not being included in the signal_duplicate_records function. So I bound dataset_adsl and source_datasets together then ran the check within derive_param_tte like so:

  # check for duplicates in dataset_adsl and source_datasets
  combined_dataset <- bind_rows(dataset_adsl, !!!source_datasets)

  signal_duplicate_records(
    dataset = combined_dataset,
    by_vars = expr_c(subject_keys, by_vars),
    cnd_type = check_type
  )

All the tests have passed now, with some additional warnings. I have to run to some meetings and seminars today, but I am pushing now for review and can check back in the evening. At first glance, I'm not sure whether the earlier tests should now get a duplicate warning as well.

tests/testthat/test-derive_param_tte.R (4 review threads, outdated and resolved)
ProfessorP-beep and others added 4 commits November 19, 2024 10:54
…om test-derive_param_tte as it was redundant, and ran pharmaverse4devs format test script addin to format test-derive_param_tte.
…le dataset_adsl and source_datasets by combining them with bind_rows before to address error of AEDECOD missing from the dataset when just calling dataset_adsl. This starts on line 381 of derive_param_tte.R
…sort-order-of-the-input' of https://github.com/pharmaverse/admiral into 2481-bug-the-result-of-derive_param_tte-depends-on-the-sort-order-of-the-input
tests/testthat/test-derive_param_tte.R (outdated, resolved)
R/derive_param_tte.R (outdated, resolved)
…ess failed runs in Test 16 of test-derive_param_tte.

removed signal_duplicate_records() from within derive_param_tte

Still need to troubleshoot errors in test script.
…ote Test 15 and 16 on test-derive_param_tte to deal with update to duplicate warnings within tryCatch and not directly by signal_duplicate_records inside derive_param_tte function.

Accepted snapshots from devtools::check
ProfessorP-beep (Collaborator, Author)

You need to fix this conflict. You might need to accept the updated snapshot

Hey @bms63, I accepted the snapshots and passed devtools::check() on my end, but it fails when pushed. I noticed in the failure output that the snapshots I accepted didn't seem to go through. Did I miss something?

bms63 (Collaborator) left a comment

It looks like a different test is falling over:
[image]

order = exprs(!!source_date_var),
by_vars = expr_c(subject_keys, by_vars),
mode = mode,
check_type = check_type
Collaborator

At the moment check_type = "message" is not accepted by filter_extreme(). Could you change this?

Comment on lines +630 to +632
arrange(!!!sources[[i]]$order) %>% # Ensure order is applied
filter_extreme(
order = exprs(!!source_date_var),
Collaborator

The order specified for the tte_source object should be added to the order argument in the filter_extreme() call, after !!source_date_var. This ensures that the records are ordered first by date and that all order variables are displayed in the message in case there are duplicates.
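A rough sketch of how the two pieces could be combined, assuming the source's order is stored as a list of expressions. The data, `source_order`, and `pick_first()` are hypothetical; only `sym()`, `exprs()`, and the `!!`/`!!!` operators from rlang are relied on. The date variable comes first, and the extra order variables are spliced in after it as tie-breakers:

```r
library(dplyr)
library(rlang)

# Made-up data: two records for subject "01" on the same date.
ae <- tibble(
  USUBJID = c("01", "01", "02"),
  AESTDT = as.Date(c("2021-01-05", "2021-01-05", "2021-02-01")),
  AESEQ = c(2, 1, 1)
)

source_date_var <- sym("AESTDT")
source_order <- exprs(AESEQ) # hypothetical order taken from the source object

# Date first, then the additional order variables.
full_order <- exprs(!!source_date_var, !!!source_order)

# Hypothetical helper: first record per subject with respect to full_order.
pick_first <- function(data) {
  data %>%
    arrange(USUBJID, !!!full_order) %>%
    distinct(USUBJID, .keep_all = TRUE)
}

# With the tie-breaker, the selected records no longer depend on input order.
pick_first(ae)$AESEQ
pick_first(ae[3:1, ])$AESEQ
```

Passing a combined order like `full_order` to the real filter_extreme() call is what the comment above asks for; the helper here only imitates that selection.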

 #' )
 filter_date_sources <- function(sources,
                                 source_datasets,
                                 by_vars,
                                 create_datetime = FALSE,
                                 subject_keys,
-                                mode) {
+                                mode,
+                                check_type = "none") {
Collaborator

The new argument should be added to the documentation.

conditionMessage(wrn)
), call. = FALSE)
}
return(source_dataset)
Collaborator

The function should not return source_dataset but the filtered source dataset.

},
warning = function(wrn) {
if (grepl("duplicate records", conditionMessage(wrn))) {
warning(sprintf(
Collaborator

cli_warn() should be used instead of warning().

The warning is duplicated, e.g., "Test 15" produces

Warning messages:
1: Dataset 'ae' contains duplicate records: Dataset contains duplicate records with respect to `STUDYID`, `USUBJID`, and `AESTDT`
ℹ Run admiral::get_duplicates_dataset() to access the duplicate records 
2: Dataset contains duplicate records with respect to `STUDYID`, `USUBJID`, and `ADT`
ℹ Run admiral::get_duplicates_dataset() to access the duplicate records 

Could you update such that we get

Warning message:
Dataset 'ae' contains duplicate records with respect to `STUDYID`, `USUBJID`, and `AESTDT`
ℹ Run admiral::get_duplicates_dataset() to access the duplicate records 

?

check_type = check_type
)
},
warning = function(wrn) {
Collaborator

This catches warnings only. Errors and messages should be handled as well. (Maybe https://rlang.r-lib.org/reference/try_fetch.html is helpful.)
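One possible shape for this, sketched with base R condition handling rather than rlang::try_fetch() or cli_warn(), so it is an illustration of the idea and not the approach requested above. The helper `with_dataset_name()` and the stub function are hypothetical. Warnings and messages are re-signalled once with the dataset name and the originals are muffled, so the wrapped code keeps running and its (filtered) result is still returned. Errors are re-thrown with the same prefix:

```r
# Hypothetical helper: evaluate `expr` and prefix every condition it raises
# with the dataset name, signalling each condition exactly once.
with_dataset_name <- function(expr, dataset_name) {
  relabel <- function(cnd) {
    paste0(
      "Dataset '", dataset_name, "': ",
      sub("\n$", "", conditionMessage(cnd))
    )
  }
  tryCatch(
    withCallingHandlers(
      expr,
      warning = function(cnd) {
        warning(relabel(cnd), call. = FALSE)
        invokeRestart("muffleWarning") # drop the original, keep running
      },
      message = function(cnd) {
        message(relabel(cnd))
        invokeRestart("muffleMessage")
      }
    ),
    error = function(cnd) stop(relabel(cnd), call. = FALSE)
  )
}

# Stub standing in for the filtering step: warns, then returns its result.
filter_stub <- function() {
  warning("contains duplicate records", call. = FALSE)
  "filtered data"
}

result <- with_dataset_name(filter_stub(), "ae")
result
```

Because the warning handler is a calling handler, the value returned is whatever the wrapped expression produced, not the unfiltered input.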

Comment on lines +83 to +91

- `check_type = "warning"` default argument added to `derive_param_tte` with an
`arg_match` function within the function so the user can use a valid input of
`error, message, warning, or none`. `signal_duplicate_records()` has also been
added to the function on lines 394 and 411 to check for uniqueness of records. (#2481)

- `order()` function has been added to `event_source()` and `censor_source()` and
defaulted to `NULL` to allow sorting of input data. (#2481)

Collaborator

This should be moved up to the "development version" section because admiral 1.1.1 is already released.

Comment on lines +89 to +90
- `order()` function has been added to `event_source()` and `censor_source()` and
defaulted to `NULL` to allow sorting of input data. (#2481)
Collaborator

Suggested change
-- `order()` function has been added to `event_source()` and `censor_source()` and
-defaulted to `NULL` to allow sorting of input data. (#2481)
+- `order` argument has been added to `event_source()` and `censor_source()` and
+defaulted to `NULL` to allow specifying variables in addition to the date variable.
+This can be used to ensure the uniqueness of the selected records if there is more
+than one record per date. (#2481)

Comment on lines +84 to +87
- `check_type = "warning"` default argument added to `derive_param_tte` with an
`arg_match` function within the function so the user can use a valid input of
`error, message, warning, or none`. `signal_duplicate_records()` has also been
added to the function on lines 394 and 411 to check for uniqueness of records. (#2481)
Collaborator

Could you rephrase this item such that it describes how the change affects the users? No need to add details for developers.

subject_keys = get_admiral_option("subject_keys"),
check_type = "warning") {
# Match check_type to valid admiral options
check_type <- rlang::arg_match(check_type, c("warning", "message", "error", "none"))
Collaborator

Please use the admiraldev assertions (assert_character_scalar()) and move it to the "checking and quoting" section below.
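A small sketch of what that could look like, assuming the admiraldev package is installed. The wrapper `check_dups()` is hypothetical and exists only to show where the assertion would sit; the permitted values are the ones listed in the snippet above:

```r
library(admiraldev)

# Hypothetical wrapper, standing in for the "checking and quoting" section.
check_dups <- function(check_type = "warning") {
  check_type <- assert_character_scalar(
    check_type,
    values = c("warning", "message", "error", "none"),
    case_sensitive = FALSE
  )
  check_type
}

check_dups("Warning")
```

With `case_sensitive = FALSE` the value is compared in lower case, and a value outside the permitted set raises an error.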

Development

Successfully merging this pull request may close these issues.

Bug: The result of derive_param_tte() depends on the sort order of the input
3 participants