separate_wider_delim renames original column when cols_remove=FALSE and names= not specified #1499
Somewhat more minimal reprex:

library(tidyverse)
library(tidyr)
library(reprex)
df <- tibble(x = c('a;b', 'c', NA, 'd;e;f', 'g;h'))
names(separate_wider_delim(
  df, x, delim = ';', too_few = 'align_start',
  names_sep = '_',
  cols_remove = FALSE
))
#> [1] "x_1" "x_2" "x_3" "x_x"

Created on 2023-11-01 with reprex v2.0.2
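Until the renaming is addressed, one possible workaround is to keep an explicitly named copy of the column and let separate_wider_delim() drop the original itself. This is only a sketch: the name x_orig is a hypothetical choice, not something the package produces, and it relies on the default cols_remove = TRUE.

```r
library(tidyr)

df <- tibble(x = c("a;b", "c", NA, "d;e;f", "g;h"))

# Keep a copy of the original column under a name of our choosing
# (`x_orig` is hypothetical), so nothing depends on how tidyr names it.
df$x_orig <- df$x

# With the default cols_remove = TRUE, `x` is replaced in place by its pieces.
out <- separate_wider_delim(
  df, x,
  delim = ";",
  too_few = "align_start",
  names_sep = "_"
)

out_names <- names(out)
out_names
```

The copy ends up after the split columns, so it does not restore the column order of separate(); it only avoids the x_x name.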
The difficulty seems to be that […]. One solution is modifying rename_with_names_sep:

rename_with_names_sep <- function(x, outer, names_sep, keep_rep_outer) {
  inner <- names(x)
  names <- apply_names_sep(outer, inner, names_sep)
  if (!keep_rep_outer) {
    names[names == paste0(outer, names_sep, outer)] <- outer
  }
  set_names(x, names)
}

An alternative, not pretty, solution:

map_unpack <- function(data, cols, fun, names_sep, names_repair, error_call = caller_env()) {
  cols <- tidyselect::eval_select(
    enquo(cols),
    data = data,
    allow_rename = FALSE,
    allow_empty = FALSE,
    error_call = error_call
  )
  col_names <- names(cols)
  ori_cols <- data[, col_names, drop = FALSE]
  for (col in col_names) {
    data[[col]] <- fun(data[[col]], col)
    cols_remove <- !(col %in% colnames(data[[col]]))
    data[[col]][[col]] <- NULL
  }
  unpacked <- unpack(
    data = data,
    cols = all_of(col_names),
    names_sep = names_sep,
    names_repair = names_repair,
    error_call = error_call
  )
  if (!cols_remove) {
    ori_index <- match(col_names, colnames(data))
    new_index <- ori_index + cumsum(lengths(data[, col_names, drop = FALSE]))
    for (i in seq_along(ori_cols)) {
      unpacked <- tibble::add_column(
        unpacked,
        ori_cols[, i, drop = FALSE],
        .before = new_index[i]
      )
    }
  }
  unpacked
}
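To make the first proposal concrete, here is a base R sketch of the renaming step on its own. It assumes the packed result for column x holds the pieces plus a retained column also named x, as the col %in% colnames(data[[col]]) check above suggests. The apply_names_sep() defined here is a simplified stand-in written for illustration, not tidyr's internal helper, and stats::setNames() replaces set_names() so the snippet has no dependencies.

```r
# Simplified stand-in (an assumption, not tidyr's code): prefix each inner
# name with the outer column name and the separator.
apply_names_sep <- function(outer, inner, names_sep) {
  paste0(outer, names_sep, inner)
}

# The proposed variant: optionally map "<outer><sep><outer>" back to "<outer>".
rename_with_names_sep <- function(x, outer, names_sep, keep_rep_outer) {
  new_names <- apply_names_sep(outer, names(x), names_sep)
  if (!keep_rep_outer) {
    new_names[new_names == paste0(outer, names_sep, outer)] <- outer
  }
  stats::setNames(x, new_names)
}

# Assumed shape of the packed columns for `x`: three pieces plus the original.
packed <- data.frame(
  `1` = "a", `2` = "b", `3` = NA_character_, x = "a;b",
  check.names = FALSE
)

# Prefixing every name reproduces the reported "x_x".
before <- names(rename_with_names_sep(packed, "x", "_", keep_rep_outer = TRUE))
# With the extra flag set to FALSE, the original name is restored.
after <- names(rename_with_names_sep(packed, "x", "_", keep_rep_outer = FALSE))

before
after
```

The exact-match test on paste0(outer, names_sep, outer) would also rename a genuine piece that happens to be called the same as the outer column, which may be worth keeping in mind when weighing this option.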
From #1539

library(tidyr)
separate_wider_delim(
  tibble(a="x y"),
  cols=a,
  delim=" ",
  names_sep="",
  cols_remove=FALSE
)
#> # A tibble: 1 × 3
#>   a1    a2    aa
#>   <chr> <chr> <chr>
#> 1 x     y     x y

Created on 2024-10-24 with reprex v2.1.1
When splitting a delimited character variable using the newer separate_wider_delim() function from the tidyr package (v 1.3.0), if you (1) use the names_sep= argument, (2) do not specify the names= argument, and (3) set cols_remove=FALSE, then the original variable is retained in the output data set (as expected) but (a) it is renamed using the names_sep= argument such that, e.g., names_sep='_' with cols=varname produces a variable named varname_varname in the output data, and (b) it is placed after the new columns, unlike how the older separate() function behaves (placing the original column before the new columns).
Note that the first point above (variable renaming) is the major issue. The second point is just something that I was not expecting.
Created on 2023-05-14 with reprex v2.0.2
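For comparison, the older behaviour mentioned in the second point can be sketched as follows. The data and the new column names passed to into= are made up for illustration; the point is only where the retained column lands when remove = FALSE.

```r
library(tidyr)

# Hypothetical one-row input for illustration.
df <- tibble(varname = "x y")

# The older separate() keeps the original column in place, before the
# new columns, when remove = FALSE.
old <- separate(
  df, varname,
  into = c("varname_1", "varname_2"),
  sep = " ",
  remove = FALSE
)

old_names <- names(old)
old_names
```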