-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle Missing Data Codes #182
Conversation
A few things here, in general I think the actual code and handling is sound and I'll separate out my thoughts below. Warning RedundancyWould it be possible to consolidate the warnings so that one warning encompasses all fields? Or at least each type (the logical versus extra fields checks)? Right now the setup makes a separate warning for each field, which would make for a lot of warnings in most cases where this comes up. I added a second yesno field with an UNK MDC to show below, you can remove it if you like: read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"),
+ token = Sys.getenv("REDCAPTIDIER_MDC_API"), raw_or_label = "label")$redcap_data[[1]]
# A tibble: 3 × 9
record_id yesno yesno2 text checkbox___1 checkbox___2 checkbox___3 dropdown form_status_complete
<dbl> <lgl> <lgl> <chr> <lgl> <lgl> <lgl> <fct> <fct>
1 1 TRUE NA text TRUE TRUE FALSE C Complete
2 2 NA NA UNK FALSE FALSE FALSE NA Complete
3 3 NA NA NA FALSE FALSE FALSE NA Incomplete
Warning messages:
1: In read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"), token = Sys.getenv("REDCAPTIDIER_MDC_API"), :
! `yesno` is type 'yesno' but contains non-logical values: UNK
ℹ These were converted to `NA` resulting in possible data loss
ℹ Does your REDCap project utilize missing data codes?
ℹ Silence this warning with `options(redcaptidier.allow.mdc = TRUE)` or set `raw_or_label = 'raw'` to access missing data codes
2: In read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"), token = Sys.getenv("REDCAPTIDIER_MDC_API"), :
! `yesno2` is type 'yesno' but contains non-logical values: UNK
ℹ These were converted to `NA` resulting in possible data loss
ℹ Does your REDCap project utilize missing data codes?
ℹ Silence this warning with `options(redcaptidier.allow.mdc = TRUE)` or set `raw_or_label = 'raw'` to access missing data codes
3: In read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"), token = Sys.getenv("REDCAPTIDIER_MDC_API"), :
! `dropdown` contains values with no labels: UNK
ℹ These were converted to `NA` resulting in possible data loss
ℹ Does your REDCap project utilize missing data codes?
ℹ Silence this warning with `options(redcaptidier.allow.mdc = TRUE)` or set `raw_or_label = 'raw'` to access missing data codes I think aesthetically something similar to the warning we had for mixed-structure data would be what to aim for: > read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"),
+ token = Sys.getenv("REDCAPTIDIER_MIXED_STRUCTURE_API"))
Error in `clean_redcap_long()` at REDCapTidieR/R/read_redcap.R:278:5:
✖ Instruments detected that have both repeating and nonrepeating instances defined in the project: mixed_structure_1 and
mixed_structure_form_complete
ℹ Set `allow_mixed_structure` to `TRUE` to override. See Mixed Structure Instruments for more information.
Run `rlang::last_trace()` to see where the error occurred. Raw vs Label DiscrepancyI might be wrong here, but we may have the raw/label order backwards? See below for the output from REDCapR: > redcap_read_oneshot(redcap_uri = Sys.getenv("REDCAP_URI"),
+ token = Sys.getenv("REDCAPTIDIER_MDC_API"), raw_or_label = "label")$data
3 records and 10 columns were read from REDCap in 1.7 seconds. The http status code was 200.
record_id yesno yesno2 text checkbox___1 checkbox___2 checkbox___3 checkbox___unk dropdown form_1_complete
1 1 Yes <NA> text Checked Checked Unchecked Unchecked C Complete
2 2 Unknown Unknown Unknown Unchecked Unchecked Unchecked Checked Unknown Complete
3 3 <NA> <NA> <NA> Unchecked Unchecked Unchecked Unchecked <NA> Incomplete Documentation SupportWould you also mind adding something to this PR that mentions this in one of the vignettes/articles? At the moment the only way a user would know about the |
@rsh52 give this another look once the CI passes. I consolidated the warnings and updated the vignette. As discussed earlier, there isn't a way of getting "Unknown" instead of UNK without reading the project metadata so I left that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Thanks for making those changes and docs additions. Looks good from my end, I still think at some point we may want to pivot to using reading project info from REDCapR but this will solve the issue now.
Description
This PR updates our handling of redcap projects using missing data codes. The new behavior is:
raw_or_label = "raw"
missing data codes kept in the data as it comes back from the APINA
inyesno
,truefalse
, andcheckbox
fields with a warning if this occursNA
indropdown
andradio
fields with a warning if this occursoptions(redcaptidier.allow.mdc = TRUE)
can be set to silence these warningsBenchmarks initially looked like we had increased run time for two of our redcaps but that turned out to be a false positive after increasing the number of microbenchmark iterations.
Proposed Changes
check_field_is_logical()
check and refactormulti_choice_to_labels()
to incorporate it. This handles checking and parsing to logicalcheck_extra_field_values()
check and add tomulti_choice_to_labels()
Screenshots
Logical field warning:
Categorical field warning:
Issue Addressed
Relates to #181
PR Checklist
Before submitting this PR, please check and verify below that the submission meets the below criteria:
.RDS
) updated underinst/testdata/create_test_data.R
usethis::use_version()
Code Review
This section to be used by the reviewer and developers during Code Review after PR submission
Code Review Checklist