Add Mixed Repeat/Nonrepeat Instrument Support #177
Addresses codecov warning
Fixed a bug where existing repeating instances greater than 1 were being overwritten as 1
I think this is ready for review now, if I can get @ezraporter to review the code (and the rest 😄) and @skadauke to review the vignette / glossary updates and the error message. (Note that the "Mixed Structure Instruments" hyperlink won't work until we've published this.)
I like the glossary, vignette, and error message updates! 👍
This one is complicated and I'm having a little trouble wrapping my head around what's going on. I tried to add comments where I was getting confused.
It looks like we're losing data in the output though. The test redcap has 5 responses:
- Nonrepeat 1
- Nonrepeat 2
- Mixed Nonrepeat 1
- Mixed Repeat 1
- Mixed Repeat 2
Those first 2 should show up in `nonrepeat_form`, but the `redcap_data` there has only "Nonrepeat 2":
```r
devtools::load_all()
data <- read_redcap(
  Sys.getenv("REDCAP_URI"),
  Sys.getenv("REDCAPTIDIER_MIXED_STRUCTURE_API"),
  enable_mixed_structure = TRUE
) |>
  extract_tibble("nonrepeat_form")
data$nonrepeat_1
#> [1] "Nonrepeat 2"
```
The data for `mixed_structure_form` looks okay.
TODO Fix structure label assignment in supertibble
Minor cleaning, updating of convert_mixed_instrument function
Fix case_when issue
Okay I've 99% convinced myself this works and I actually think your solution is really good. I say we:
- Run our timing benchmarks to make sure the latest change didn't mess things up. (Can we also add identifiers to the `microbenchmark_results.csv` so it's easy to line things up when we add new redcaps?)
- Merge this and ask the issue opener to test the dev version
```r
db_data_long,
db_metadata_long,
linked_arms,
has_mixed_structure_forms = has_mixed_structure_forms,
```
I think this is fine, but noting that one implication of this is that `convert_mixed_instrument()` gets run for every instrument even if it's actually repeating rather than mixed. It works because `mixed_structure_ref` gets filtered to 0 rows for repeating instruments and the `for` loop in `convert_mixed_instrument()` doesn't run. We do still need to `filter()` `mixed_structure_ref` every time to find that out though, which could be a performance hit. No need to optimize until it's a problem though.
Does it? The actual conversion is wrapped in a check on `has_mixed_structure_forms`, so if `FALSE` (which it would be for the whole `map()` sequence) then `convert_mixed_instrument()` wouldn't get run unless I'm missing something.
REDCapTidieR/R/clean_redcap_long.R
Lines 315 to 317 in 65d5baf
```r
if (has_mixed_structure_forms) {
  db_data_long <- convert_mixed_instrument(db_data_long, mixed_structure_ref %>% filter(form_name == my_form))
}
```
Right, but if you have mixed and repeating instruments then `has_mixed_structure_forms` is `TRUE` and it gets run for all the repeating instruments in addition to the mixed instruments.
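One possible way to skip the per-form `filter()` for purely repeating instruments, sketched in base R. This is only an illustration, not the code in this PR: it assumes `mixed_structure_ref` has a `form_name` column as in the snippet above, and the field name and the `repeat_form` instrument are made up.

```r
# Hypothetical stand-in for mixed_structure_ref; the real object may have more columns.
mixed_structure_ref <- data.frame(
  field_name = "mixed_structure_1",   # invented field name
  form_name = "mixed_structure_form",
  stringsAsFactors = FALSE
)

# Forms being processed: one mixed, one purely repeating (invented name)
forms <- c("mixed_structure_form", "repeat_form")

# Look up the mixed forms once, outside the per-form loop
mixed_forms <- unique(mixed_structure_ref$form_name)

# TRUE only for forms that actually need conversion
needs_conversion <- forms %in% mixed_forms
names(needs_conversion) <- forms

print(needs_conversion)
```

With a lookup like this, the per-form check could become `has_mixed_structure_forms && my_form %in% mixed_forms`, so the conversion is only attempted for instruments that are actually mixed.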
Co-authored-by: Ezra Porter <[email protected]>
Updated run results and added column outputs for the database description and source (ouhsc / redcaptidier).
Thanks!
Description
This PR seeks to add a new parameter to `read_redcap()` which will allow users to override the check that stops data exports for REDCap projects where instruments are detected to be both repeating and nonrepeating. The rationale for not allowing this was to adhere to tidy data principles.

For the purposes of this PR and documentation, here are my definitions for some things:
- `structure`, and has only ever been "nonrepeating" / "repeating" for values. I wanted to keep this terminology so some of the updated functions use "structure" / "mixed structure"

Proposed Changes
List changes below in bullet format:
- `enable_repeat_nonrepeat` to `read_redcap()` and lower-level function `clean_redcap_long()`
- `check_repeat_and_nonrepeat()` to new function `get_mixed_structure_fields()`
- `get_mixed_structure_fields()` in handler function for converting nonrepeat parts of mixed structure instruments to repeating ones with a single instance via `convert_mixed_instrument()`
- `test_creds.R` (We now have a new database where a new env variable is needed: `REDCAPTIDIER_MIXED_STRUCTURE_API`)
- `structure` column in supertibble to say "mixed" for mixed structure instruments

Remaining TODOs:
- Determine if we should add a new category to `structure` in the supertibble
- Update Diving Deeper vignette/article (holding off until we're happy with logic choices and syntax)

Issue Addressed
Addresses #126
Addresses #169
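For readers new to the idea, the conversion listed under Proposed Changes (nonrepeating rows of a mixed structure instrument become repeating rows with a single instance) can be pictured with a small base-R sketch. This is not the package's `convert_mixed_instrument()`; `convert_mixed_sketch()` is a hypothetical helper, the data values are invented, and the column names assume REDCap's usual long export (`redcap_repeat_instrument`, `redcap_repeat_instance`).

```r
# Toy long-format export: record 1 filled the form as nonrepeating,
# record 2 filled it as a repeating form with two instances. Values are invented.
db_data_long <- data.frame(
  record_id = c(1, 2, 2),
  redcap_repeat_instrument = c(NA, "mixed_structure_form", "mixed_structure_form"),
  redcap_repeat_instance = c(NA, 1, 2),
  mixed_structure_1 = c("Mixed Nonrepeat 1", "Mixed Repeat 1", "Mixed Repeat 2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: relabel nonrepeating rows that carry data for the
# mixed form as instance 1 of that form. Rows that already have an
# instance number are left untouched.
convert_mixed_sketch <- function(data, form_name, fields) {
  has_form_data <- rowSums(!is.na(data[, fields, drop = FALSE])) > 0
  is_nonrepeat <- is.na(data$redcap_repeat_instance)
  to_convert <- has_form_data & is_nonrepeat

  data$redcap_repeat_instrument[to_convert] <- form_name
  data$redcap_repeat_instance[to_convert] <- 1
  data
}

converted <- convert_mixed_sketch(
  db_data_long,
  form_name = "mixed_structure_form",
  fields = "mixed_structure_1"
)

print(converted)
```

The sketch deliberately ignores the case where the same nonrepeating row also carries data for a truly nonrepeating instrument; handling that row without losing the nonrepeating data is the harder part discussed in the review.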
PR Checklist
Before submitting this PR, please check and verify below that the submission meets the below criteria:
- `.RDS`) updated under `inst/testdata/create_test_data.R`
- `usethis::use_version()`
Code Review
This section to be used by the reviewer and developers during Code Review after PR submission
Code Review Checklist