Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor expand_model_out_grid for readability #147

Open
zkamvar opened this issue Oct 30, 2024 · 0 comments
Open

Refactor expand_model_out_grid for readability #147

zkamvar opened this issue Oct 30, 2024 · 0 comments
Assignees
Labels
upkeep maintenance, infrastructure, and similar

Comments

@zkamvar
Copy link
Member

zkamvar commented Oct 30, 2024

I've found myself digging into the expand_model_out_grid() function and getting stuck on more than one occasion. The places I have found myself stuck in are nested mapping function calls combined with pipe operators without comments. I don't have a problem with pipes or nesting per se, but these fill up my working memory and if I'm trying to debug something like #123, it takes me a long time to get to the correct conclusion.

This is an important function, and it's done a lot of heavy lifting. Before we move over to schemas version 4.0, I would propose to give it a polish and comments so that it becomes easier to navigate.

For example, two multi-line statements were required for expand_output_type_grid(), but they took me a good amount of time to figure out what was happening in each. Instead, they could be stand-alone functions called get_task_id_list() and get_output_type_list() with comments about their operations:

task_id_l <- purrr::map(
round_config[["model_tasks"]],
~ .x[["task_ids"]] %>%
derived_taskids_to_na(derived_task_ids) %>%
null_taskids_to_na()
) %>%
# Fix round_id value to current round_id in round_id variable column
fix_round_id(
round_id = round_id,
round_config = round_config,
round_ids = hubUtils::get_round_ids(config_tasks)
) %>%
process_grid_inputs(required_vals_only = required_vals_only)

output_type_l <- purrr::map(
round_config[["model_tasks"]],
function(.x) {
out <- .x[["output_type"]]
if (is.null(output_types)) {
out
} else {
mt_output_types <- output_types[output_types %in% names(out)]
out[mt_output_types]
}
}
) %>%
purrr::map(
~ extract_mt_output_type_ids(.x, config_tid)
) %>%
process_grid_inputs(required_vals_only = required_vals_only) %>%
purrr::map(function(.x) {
purrr::compact(.x)
})

The helper functions themselves could be re-factored as well with comments indicating what's happening. Continuing the thread, when I encountered expand_output_type_grid(), I was already pretty deep, but I was not really understanding what this was doing because it had pipe operators nested within a mapping function that was further piped to another function.

purrr::imap(
output_type_values,
~ c(task_id_values, list(
output_type = .y,
output_type_id = .x
)) %>%
purrr::compact() %>%
expand.grid(stringsAsFactors = FALSE)
) %>%
purrr::list_rbind()
}

If I were to refactor this for readability, I would do something like:

  # generate expanded grids for each output type with the task IDs 
  output_type_grid_list <- purrr::imap(output_type_values, output_type_grid, task_id_values = task_id_values)
  # combine them into a single grid
  purrr::list_rbind(output_type_grid_list)
}

#' Create a grid for a single output type
#'
#' @param output_type a character indicating the output type
#' @param output_type_id a character vector of the corresponding output_type_id or NULL
#' @param task_id_values a list of task IDs
#' @noRd
output_type_grid <- function(output_type, output_type_id, task_id_values) {
  # generate a list, remove the NULL elements, and then use `expand.grid()` to create
  # a table with all combinations of of the output type IDs and task IDs  
  c(task_id_values, list(output_type = output_type, output_type_id = output_type_id)) %>%
    purrr::compact() %>%
    expand.grid(stringsAsFactors = FALSE)
}

Again, I think this is an important function and it has had a lot of hard work put into it by @annakrystalli to cover all the weirdness that arises from the schemas and backwards compatibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
upkeep maintenance, infrastructure, and similar
Projects
Status: In Progress
Development

No branches or pull requests

2 participants