Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import linelist code, start update to datatagr #6

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -55,3 +55,6 @@ rsconnect/
# macOS hidden files
.DS_Store
inst/doc

# Random files to ignore
tests/testthat/Rplots.pdf
20 changes: 17 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -7,14 +7,28 @@ Authors@R: c(
)
Description: Provides tools to help tag and validate data according to user-specified rules. The 'datatagr' class adds a custom tagging system to classical 'data.frame' objects to identify key data. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines more robust and reliable.
License: MIT + file LICENSE
Suggests:
Depends:
R (>= 3.1.0)
Imports:
checkmate,
dplyr,
lifecycle,
rlang,
tidyselect
Suggests:
callr,
knitr,
magrittr,
outbreaks,
rmarkdown,
spelling,
testthat (>= 3.0.0)
testthat,
tibble
Config/Needs/website:
r-lib/pkgdown,
epiverse-trace/epiversetheme
VignetteBuilder:
knitr
Config/Needs/website: epiverse-trace/epiversetheme
Config/testthat/edition: 3
Config/testthat/parallel: true
Encoding: UTF-8
Expand Down
25 changes: 25 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1,2 +1,27 @@
# Generated by roxygen2: do not edit by hand

S3method("$<-",linelist)
S3method("[",linelist)
S3method("[<-",linelist)
S3method("[[<-",linelist)
S3method("names<-",linelist)
S3method(print,linelist)
S3method(select,linelist)
export(get_lost_tags_action)
export(has_tag)
export(lost_tags_action)
export(make_linelist)
export(select_tags)
export(set_tags)
export(tags)
export(tags_defaults)
export(tags_df)
export(tags_names)
export(tags_types)
export(validate_linelist)
export(validate_tags)
export(validate_types)
importFrom(dplyr,select)
importFrom(lifecycle,deprecated)
importFrom(stats,setNames)
importFrom(utils,modifyList)
22 changes: 22 additions & 0 deletions R/custom_assertions.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Built following instructions from
# https://mllg.github.io/checkmate/articles/checkmate.html#extending-checkmate
check_not_data_table <- function(x) {
if (inherits(x, "data.table")) {
return("must NOT be a data.table")
}
return(TRUE)
}

assert_not_data_table <- function(
x,
.var.name = checkmate::vname(x),
add = NULL) {
if (missing(x)) {
stop(sprintf(
"argument \"%s\" is missing, with no default",
.var.name
))
}
res <- check_not_data_table(x)
checkmate::makeAssertion(x, res, .var.name, add)
}
121 changes: 121 additions & 0 deletions R/datatagr-package.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
#' Base Tools for Tagging and Validating Data
#'
#' The *datatagr* package provides tools to help tag and validate data. The
#' 'datatagr' class adds a custom tagging system to classical 'data.frame'
#' objects to identify key data. Once tagged, these variables can be seamlessly
#' used in downstream analyses, making data pipelines more robust and reliable.
#'
#' @aliases datatagr
#'
#' @section Main functions:
#'
#' * [make_linelist()]: to create `datatagr` objects from a `data.frame` or a
#' `tibble`
#'
#' * [set_tags()]: to change or add tagged variables in a `datatagr`
#'
#' * [tags()]: to get the list of tags of a `datatagr`
#'
#' * [tags_df()]: to get a `data.frame` of all tagged variables
#'
#' * [lost_tags_action()]: to change the behaviour of actions where tagged
#' variables are lost (e.g. removing columns storing tagged variables) to
#' issue warnings, errors, or do nothing
#'
#' * [get_lost_tags_action()]: to check the current behaviour of actions where
#' tagged variables are lost
#'
#' @section Dedicated methods:
#'
#' Specific methods commonly used to handle `data.frame` are provided for
#' `datatagr` objects, typically to help flag or prevent actions which could
#' alter or lose tagged variables (and may thus break downstream data
#' pipelines).
#'
#' * `names() <-` (and related functions, such as [dplyr::rename()]) will
#' rename tags as needed
#'
#' * `x[...] <-` and `x[[...]] <-` (see [sub_linelist]): will adopt the
#' desired behaviour when tagged variables are lost
#'
#' * `print()`: prints info about the `datatagr` in addition to the
#' `data.frame` or `tibble`
#'
#' @examples
#'
#' if (require(outbreaks)) {
#' # using base R style
#'
#' ## dataset we'll create a linelist from, only using the first 50 entries
#' measles_hagelloch_1861[1:50, ]
#'
#' ## create linelist
#' x <- make_linelist(measles_hagelloch_1861[1:50, ],
#' id = "case_ID",
#' date_onset = "date_of_prodrome",
#' age = "age",
#' gender = "gender"
#' )
#' x
#'
#' ## check tagged variables
#' tags(x)
#'
#' ## robust renaming
#' names(x)[1] <- "identifier"
#' x
#'
#' ## example of dropping tags by mistake - default: warning
#' x[, 2:5]
#'
#' ## to silence warnings when taggs are dropped
#' lost_tags_action("none")
#' x[, 2:5]
#'
#' ## to trigger errors when taggs are dropped
#' # lost_tags_action("error")
#' # x[, 2:5]
#'
#' ## reset default behaviour
#' lost_tags_action()
#'
#'
#' # using tidyverse style
#'
#' ## example of creating a linelist, adding a new variable, and adding a tag
#' ## for it
#'
#' if (require(dplyr) && require(magrittr)) {
#' x <- measles_hagelloch_1861 %>%
#' tibble() %>%
#' make_linelist(
#' id = "case_ID",
#' date_onset = "date_of_prodrome",
#' age = "age",
#' gender = "gender"
#' ) %>%
#' mutate(result = if_else(is.na(date_of_death), "survived", "died")) %>%
#' set_tags(outcome = "result") %>%
#' rename(identifier = case_ID)
#'
#' head(x)
#'
#' ## extract tagged variables
#' x %>%
#' select(has_tag(c("gender", "age")))
#'
#' x %>%
#' tags()
#'
#' x %>%
#' select(starts_with("date"))
#' }
#' }
#'
#' @keywords internal
"_PACKAGE"

## usethis namespace: start
#' @importFrom lifecycle deprecated
## usethis namespace: end
NULL
2 changes: 0 additions & 2 deletions R/dev-utils.R
Original file line number Diff line number Diff line change
@@ -1,11 +1,9 @@
# This unexported function adds a custom item to `usethis::use_release_issue()`
release_bullets <- function() {

c(
"Run `goodpractice::gp()`",
"Review [WORDLIST](https://docs.cran.dev/spelling#wordlist)",
"Check if `# nolint` comments are still needed with recent lintr releases",
"All contributors to this release are acknowledged in some way"
)

}
24 changes: 24 additions & 0 deletions R/drop_linelist.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
#' Remove the linelist class from an object
#'
#' Internal function. Used for dispatching to other methods when `NextMethod` is
#' an issue (typically to pass additional arguments to the `linelist` method).
#'
#' @param x a `linelist` object
#'
#' @param remove_tags a `logical` indicating if tags should be removed from the
#' attributes; defaults to `TRUE`
#'
#' @noRd
#'
#' @return The function returns the same object without the `linelist` class.
#'
#'

drop_linelist <- function(x, remove_tags = TRUE) {
classes <- class(x)
class(x) <- setdiff(classes, "linelist")
if (remove_tags) {
attr(x, "tags") <- NULL
}
x
}
37 changes: 37 additions & 0 deletions R/has_tag.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#' A selector function to use in \pkg{tidyverse} functions
#'
#' @param tags A character vector of tags listing the variables you want to
#' operate on
#'
#' @returns A numeric vector containing the position of the columns with the
#' requested tags
#'
#' @export
#'
#' @examples
#' if (require(outbreaks) && require(dplyr)) {
#' ## dataset we'll create a linelist from
#' measles_hagelloch_1861
#'
#' ## create linelist
#' x <- make_linelist(measles_hagelloch_1861,
#' id = "case_ID",
#' date_onset = "date_of_prodrome",
#' age = "age",
#' gender = "gender"
#' )
#' head(x)
#'
#' x %>%
#' select(has_tag(c("id", "age"))) %>%
#' head()
#' }
has_tag <- function(
tags) {
dat <- tidyselect::peek_data(fn = "has_tag")
dat_tags <- tags(dat)

cols_to_extract <- dat_tags[names(dat_tags) %in% tags]

which(colnames(dat) %in% cols_to_extract)
}
78 changes: 78 additions & 0 deletions R/lost_tags_action.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
#' Check and set behaviour for lost tags
#'
#' This function determines the behaviour to adopt when tagged variables of a
#' `linelist` are lost e.g. through subsetting. This is achieved using `options`
#' defined for the `linelist` package.
#'
#' @param action a `character` indicating the behaviour to adopt when tagged
#' variables have been lost: "error" (default) will issue an error; "warning"
#' will issue a warning; "none" will do nothing
#'
#' @param quiet a `logical` indicating if a message should be displayed; only
#' used outside pipelines
#'
#' @param x deprecated
#'
#' @return returns `NULL`; the option itself is set in `options("linelist")`
#'
#' @details The errors or warnings generated by linelist in case of tagged
#' variable loss has a custom class of `linelist_error` and `linelist_warning`
#' respectively.
#'
#' @export
#'
#' @rdname lost_tags_action
#'
#' @aliases lost_tags_action get_lost_tags_action
#'
#' @examples
#' # reset default - done automatically at package loading
#' lost_tags_action()
#'
#' # check current value
#' get_lost_tags_action()
#'
#' # change to issue errors when tags are lost
#' lost_tags_action("error")
#' get_lost_tags_action()
#'
#' # change to ignore when tags are lost
#' lost_tags_action("none")
#' get_lost_tags_action()
#'
#' # reset to default: warning
#' lost_tags_action()
#'
lost_tags_action <- function(action = c("warning", "error", "none"),
quiet = FALSE,
x) {
if (!missing(x) || inherits(action, "linelist")) {
stop(
"Using `lost_tags_action()` in a pipeline is deprecated",
call. = FALSE
)
}

linelist_options <- options("linelist")$linelist # nolint

action <- match.arg(action)
linelist_options$lost_tags_action <- action
options(linelist = linelist_options) # nolint
if (!quiet) {
if (action == "warning") msg <- "Lost tags will now issue a warning."
if (action == "error") msg <- "Lost tags will now issue an error."
if (action == "none") msg <- "Lost tags will now be ignored."
message(msg)
}
return(invisible(NULL))
}



#' @export
#'
#' @rdname lost_tags_action

get_lost_tags_action <- function() {
options("linelist")$linelist$lost_tags_action # nolint
}
Loading
Loading