Updates from full package review #43

Merged
merged 14 commits into from
Oct 2, 2024
6 changes: 4 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -2,8 +2,10 @@ Package: datatagr
Title: Generic Data Labelling and Validating
Version: 0.0.1
Authors@R: c(
person("Chris", "Hartgerink", , "[email protected]", role = "cre",
comment = c(ORCID = "0000-0003-1050-6809"))
person("Chris", "Hartgerink", , "[email protected]", role = c("cre", "aut"),
comment = c(ORCID = "0000-0003-1050-6809")),
person("Hugo", "Gruson", , "[email protected]", role = "rev",
comment = c(ORCID = "0000-0002-4094-1476"))
)
Description: Provides tools to help label and validate data according to user-specified rules. The 'datatagr' class adds variable level attributes to 'data.frame' columns. Once labelled, these variables can be seamlessly used in downstream analyses, making data pipelines clearer, more robust, and more reliable.
License: MIT + file LICENSE
4 changes: 2 additions & 2 deletions R/datatagr-package.R
@@ -1,7 +1,7 @@
#' Base Tools for Labelling and Validating Data
#'
#' The *datatagr* package provides tools to help label and validate data. The
#' 'datatagr' class adds column level attributes to a 'data.frame'.
#' The \pkg{datatagr} package provides tools to help label and validate data.
#' The 'datatagr' class adds column level attributes to a 'data.frame'.
#' Once labelled, variables can be seamlessly used in downstream analyses,
#' making data pipelines more robust and reliable.
#'
4 changes: 4 additions & 0 deletions R/label_variables.R
@@ -39,6 +39,10 @@ label_variables <- function(x, labels) {
for (name in names(labels)) {
label_value <- unlist(labels[names(labels) == name])

# We use the `label` attribute to store the label
# We use `ifelse` to handle the case where the label is set to `NULL`
# This is because `as.character(NULL)` returns `character(0)`
# attr(x[[name]], "label") <- NULL does not have the desired result
attr(x[[name]], "label") <- ifelse(is.null(label_value),
"",
as.character(label_value)
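
The behaviour described in the new comments can be checked in base R. The sketch below is illustrative only: the data frame `x` and the helper `set_label()` are hypothetical names, not part of the package. It shows that coercing `NULL` gives a zero-length vector, that assigning `NULL` removes the attribute, and that the `ifelse()` guard stores an empty string instead.

```r
# Illustrative data; not taken from the package.
x <- data.frame(speed = c(4, 7, 8))
attr(x[["speed"]], "label") <- "Miles per hour"

# Coercing NULL yields a zero-length vector, not an empty string.
coerced <- as.character(NULL)
length(coerced) # 0

# Assigning NULL removes the attribute instead of storing an empty label.
attr(x[["speed"]], "label") <- NULL
is.null(attr(x[["speed"]], "label")) # TRUE

# Hypothetical helper mirroring the guard used in the diff:
# a NULL label becomes "", any other label is kept as a character value.
set_label <- function(label_value) {
  ifelse(is.null(label_value), "", as.character(label_value))
}
empty_label <- set_label(NULL)
kept_label <- set_label("Miles per hour")
```

Because the test passed to `ifelse()` has length one, only the matching branch ends up in the result, so the zero-length `as.character(NULL)` never reaches the attribute.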
4 changes: 1 addition & 3 deletions R/labels_df.R
@@ -25,10 +25,8 @@ labels_df <- function(x) {
labels <- unlist(labels(x))
out <- drop_datatagr(x)

# Find the intersection of names(out) and names(labels)
common_names <- intersect(names(out), names(labels))
# Replace the names of out that are in intersection with corresponding labels
names(out)[match(common_names, names(out))] <- labels[common_names]
names(out)[match(names(labels), names(out))] <- labels[names(labels)]

out
}
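
A minimal sketch of the simplified renaming step, using made-up data in place of the package objects. It assumes every name in `labels` is also a column of `out`, which appears to be what the single `match()` call relies on. In the package, `out` comes from `drop_datatagr(x)` and `labels` from `unlist(labels(x))`.

```r
# Illustrative stand-ins for the objects built inside labels_df().
out <- data.frame(speed = c(4, 7), dist = c(2, 10))
labels <- c(dist = "Distance in miles", speed = "Miles per hour")

# match() returns the position of each labelled variable in names(out),
# so labels are assigned by name rather than by order.
idx <- match(names(labels), names(out))
names(out)[idx] <- labels[names(labels)]

names(out)
```

Here the labels are deliberately listed in a different order from the columns to show that the lookup is by name.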
6 changes: 1 addition & 5 deletions R/lost_labels.R
@@ -16,11 +16,7 @@ lost_labels <- function(old, new, lost_action) {
if (lost_action != "none" && length(lost_vars) > 0) {
lost_labels <- lapply(lost_vars, function(label) old[[label]])

lost_msg <- paste(lost_vars,
lost_labels,
sep = " - ",
collapse = ", "
)
lost_msg <- vars_labels(lost_vars, lost_labels)
msg <- paste(
"The following labelled variables are lost:\n",
lost_msg
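
To see what this hunk changes in the emitted message, the sketch below builds the text both ways with base `paste()`. The `old` list and `lost_vars` vector are illustrative values, not package data, and the new format is written out inline rather than through `vars_labels()`.

```r
# Illustrative inputs; in the package these come from the datatagr object.
old <- list(speed = "Miles per hour", dist = "Distance in miles")
lost_vars <- c("speed", "dist")
lost_labels <- lapply(lost_vars, function(label) old[[label]])

# Previous format: all pairs on one line, separated by commas.
before <- paste(lost_vars, lost_labels, sep = " - ", collapse = ", ")

# New format: one pair per line, each continuation line indented by a space.
after <- paste(lost_vars, lost_labels, sep = " - ", collapse = "\n ")

# paste() inserts a space after the "\n", which indents the first pair too.
msg <- paste("The following labelled variables are lost:\n", after)
cat(msg, "\n")
```

The resulting `msg` has the same shape as the string expected in `tests/testthat/test-square_bracket.R`.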
7 changes: 4 additions & 3 deletions R/print.datatagr.R
@@ -38,12 +38,13 @@ print.datatagr <- function(x, ...) {
label_names <- names(label_values)

# Construct the labels_txt string from the filtered pairs
labels_txt <- paste(label_names, label_values, sep = "-", collapse = ", ")
labels_txt <- vars_labels(label_names, label_values)

if (labels_txt == "") {
labels_txt <- "[no labelled variables]"
cat("\n[no labelled variables]\n")
} else {
cat("\nlabelled variables:\n", labels_txt, "\n")
}
cat("\nlabels:", labels_txt, "\n")

invisible(x)
}
File renamed without changes.
6 changes: 1 addition & 5 deletions R/restore_labels.R
@@ -38,11 +38,7 @@ restore_labels <- function(x, newLabels,
if (lost_action != "none" && length(lost_vars) > 0) {
lost_labels <- lapply(lost_vars, function(label) newLabels[[label]])

lost_msg <- paste(lost_vars,
lost_labels,
sep = " - ",
collapse = ", "
)
lost_msg <- vars_labels(lost_vars, lost_labels)
msg <- paste(
"The following labelled variables are lost:\n",
lost_msg
2 changes: 1 addition & 1 deletion R/validate_labels.R
@@ -37,7 +37,7 @@ validate_labels <- function(x) {

if (is.null(unlist(x_labels))) stop("`x` has no labels")

# check that x is a list, and each label is a `character`
# check that x_labels is a list, and each label is a `character`
checkmate::assert_list(x_labels, types = c("character", "null"))

x
3 changes: 1 addition & 2 deletions R/validate_types.R
@@ -37,8 +37,7 @@
validate_types <- function(x, ...) {
checkmate::assert_class(x, "datatagr")
types <- rlang::list2(...)
checkmate::assert_list(types, min.len = 1)
checkmate::assert_list(types, types = "character")
checkmate::assert_list(types, min.len = 1, types = "character")

vars_to_check <- intersect(names(x), names(types))

11 changes: 11 additions & 0 deletions R/vars_labels.R
@@ -0,0 +1,11 @@
#' Internal printing function for variables and labels
#'
#' @param vars a `character` vector of variable names
#' @param labels a `character` vector of labels
vars_labels <- function(vars, labels) {
paste(vars,
labels,
sep = " - ",
collapse = "\n "
)
}
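
A short usage sketch for the new helper, with the function body copied from the hunk above and illustrative inputs. It also shows the zero-length case: `paste()` with `collapse` returns an empty string when given no variables, which is presumably what the `labels_txt == ""` check in `print.datatagr()` depends on.

```r
# Helper as added in R/vars_labels.R.
vars_labels <- function(vars, labels) {
  paste(vars,
    labels,
    sep = " - ",
    collapse = "\n "
  )
}

# Illustrative inputs: one "variable - label" pair per line.
labels_txt <- vars_labels(
  c("speed", "dist"),
  c("Miles per hour", "Distance in miles")
)
cat("\nlabelled variables:\n", labels_txt, "\n")

# With no labelled variables the result is a single empty string.
no_labels_txt <- vars_labels(character(0), character(0))
```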
3 changes: 2 additions & 1 deletion README.Rmd
@@ -76,8 +76,8 @@ To make it easier for us to evaluate your contribution, please run the following

```r
styler::style_pkg()
spelling::update_wordlist(pkg = ".", vignettes = TRUE)
devtools::document()
spelling::update_wordlist(pkg = ".", vignettes = TRUE)

lintr::lint_package()

@@ -94,6 +94,7 @@ This will reduce the time it takes for us to review your contribution. Thank you

This project is related to other existing projects in R or other languages, but also differs from them in the following aspects:

- [labelled](https://github.com/larmarange/labelled/): A package for labelling data in R, but it is more focused on labelling variables than validating them.
- [linelist](https://github.com/epiverse-trace/linelist): A package for managing and validating linelist data - the original inspiration for datatagr.

### Code of Conduct
5 changes: 4 additions & 1 deletion README.md
@@ -81,8 +81,8 @@ is consistent with the rest of the package:

``` r
styler::style_pkg()
spelling::update_wordlist(pkg = ".", vignettes = TRUE)
devtools::document()
spelling::update_wordlist(pkg = ".", vignettes = TRUE)

lintr::lint_package()

@@ -100,6 +100,9 @@ Thank you! 😊
This project is related to other existing projects in R or other
languages, but also differs from them in the following aspects:

- [labelled](https://github.com/larmarange/labelled/): A package for
labelling data in R, but it is more focused on labelling variables
than validating them.
- [linelist](https://github.com/epiverse-trace/linelist): A package for
managing and validating linelist data - the original inspiration for
datatagr.
5 changes: 5 additions & 0 deletions inst/WORDLIST
@@ -6,7 +6,12 @@ ORCID
RECON
Unlabeled
dplyr
leverspeed
lifecycle
linelist
messspeeds
rlang
tibble
tidyselect
tidyverse
unclassing
9 changes: 7 additions & 2 deletions man/datatagr-package.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 16 additions & 0 deletions man/vars_labels.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion tests/testthat/_snaps/compat-dplyr.md
@@ -1,7 +1,8 @@
# Compatibility with dplyr::transmute()

The following labelled variables are lost:
speed - Miles per hour, dist - Distance in miles
speed - Miles per hour
dist - Distance in miles

# Compatibility with dplyr::mutate(.keep)

6 changes: 4 additions & 2 deletions tests/testthat/_snaps/print.md
@@ -54,7 +54,9 @@
49 24 120
50 25 85

labels: speed-Miles per hour, dist-Distance in miles
labelled variables:
speed - Miles per hour
dist - Distance in miles

---

@@ -112,5 +114,5 @@
49 24 120
50 25 85

labels: [no labelled variables]
[no labelled variables]

2 changes: 1 addition & 1 deletion tests/testthat/test-square_bracket.R
@@ -14,7 +14,7 @@ test_that("tests for [ operator", {
expect_error(x[, 1], msg)

lost_labels_action("warning", quiet = TRUE)
msg <- "The following labelled variables are lost:\n speed - Miles per hour, dist - Distance in miles"
msg <- "The following labelled variables are lost:\n speed - Miles per hour\n dist - Distance in miles"
expect_warning(x[, NULL], msg)

# functionalities