Feature Request Description
In addition to extracting selected tibbles, extract_tibbles should give users the option to join them into a single tibble instead of returning a list. In recent projects, joining has often been the next logical step after calling extract_tibbles.
Proposed Solution
Prototyped logic is available in our internal Prodigy Reporter. A new argument (suggested: join_tibbles = TRUE/FALSE) should kick off the join operations. Since we abstract some column names to a common name, e.g. form_status_complete, we need to account for column names that are duplicated across the tibbles themselves.
# Load Libraries ===============================================================
library(REDCapTidieR)
library(tidyverse)
library(tidyselect)
library(rlang)
# tibble List Selection Function ===============================================
tibble_list_select <- function(supertibble, tbls) {
  tbls <- eval_select(data = supertibble, expr = enquo(tbls))
  supertibble[tbls]
}
# Join Operation ===============================================================
join_tibbles <- function(extracted_tibbles, record_id) {
  # First: compile all names related to tibbles
  # Second: Identify names that exist in multiple tibbles (not record_id)
  # Third: Append identified names with name of the tibble they belong to
  duplicate_colnames <- extracted_tibbles %>%
    map(names) %>%
    unlist() %>%
    tibble(name = .) %>%
    count(name) %>%
    # don't append table name to pk: infseq_id
    filter(n > 1 & name != record_id) %>% # <-- Need to functionally call out record_id in case of name change -->
    pull(name)
  extracted_tibbles <- map2(
    extracted_tibbles,
    names(extracted_tibbles),
    .f = function(df, df_name) {
      # [duplicate_col] -> [duplicate_col].[table_name]
      rename_with(
        df,
        .cols = any_of(duplicate_colnames),
        .fn = function(col) paste0(col, ".", df_name)
      )
    }
  )
  # Multi-left_join using reduce, filter for inputs resulting in include == TRUE
  out <- reduce(
    extracted_tibbles,
    dplyr::left_join,
    by = record_id # <-- Need to functionally update this -->
  )
  out
}
Here's how I envision this being implemented, but imagine the external functions as internal to extract_tibbles instead:
You should be able to copy and paste all of this into a script and use REDCapTidieR 0.2.0 to view the proposed output. Open to suggestions on naming conventions for identified duplicate columns (currently [duplicate_col].[table_name]).
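As a rough illustration of the renaming convention described above, here is a minimal, self-contained sketch on toy data. It does not call REDCapTidieR at all: the two tibbles, their values, and the `record_id` key are made-up stand-ins for a named list of extracted instruments, and the logic is a condensed variant of the prototype rather than the prototype itself.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-ins for two extracted instruments (values are invented)
extracted <- list(
  demographics = tibble(
    record_id = c(1, 2),
    age = c(34, 51),
    form_status_complete = c("Complete", "Incomplete")
  ),
  disease_response = tibble(
    record_id = c(1, 2),
    response = c("PR", "CR"),
    form_status_complete = c("Complete", "Complete")
  )
)

# Column names that occur in more than one tibble, excluding the join key
all_names <- unlist(map(extracted, names), use.names = FALSE)
dupes <- setdiff(unique(all_names[duplicated(all_names)]), "record_id")

# [duplicate_col] -> [duplicate_col].[table_name]
renamed <- imap(
  extracted,
  function(df, df_name) {
    rename_with(df, function(col) paste0(col, ".", df_name), .cols = any_of(dupes))
  }
)

# Successive left joins on the key, starting from the first tibble in the list
joined <- reduce(renamed, left_join, by = "record_id")
names(joined)
```

With this toy input, only `form_status_complete` is shared, so it is the only column that picks up a table-name suffix; `record_id`, `age`, and `response` keep their names.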
Checklist
The issue is atomic
The issue description is documented
The issue title describes the problem succinctly
Developers are assigned to the issue
Labels are assigned to the issue
Discussed an alternative, higher-level API for this using the existing extract_tibble() function. The following would return a single tibble with demographics and disease_response instruments joined together appropriately.
One question is what "appropriately" means. Another question is how to make this syntax concise and expressive while at the same time not limiting flexibility. We will see use cases for table joins during development of the Prodigy reporter and aim to implement a solution with 0.3 in a few months.
I had one more thought, not sure if it's possible or even a good idea. What if
supertbl |>
  extract_tibble(everything())
returns a tibble that's (mostly) the same as the block matrix? The use case here might be that people could make changes inside the supertibble and then send those changes back to the REDCap instance. I know I said we don't want to touch writing, but it's a thought. It could also guide how we plan the structure of a table in which nonrepeating and repeating instruments are combined.
skadauke changed the title from "[FEATURE] extract_tibbles should allow users to join specified tables" to "[FEATURE] extract_tibble should allow users to join specified tables" on Dec 16, 2022