*_join syntax inconsistency? #126

Open
rdboyes opened this issue Dec 2, 2024 · 1 comment

Comments

rdboyes commented Dec 2, 2024

When you do a join in TidierData on two columns that do not have the same name, the syntax we use is e.g.:

@left_join(d1, d2, "c2" = "c1")

Does anyone remember why we chose this syntax? It's not the same as DataFrames leftjoin (leftjoin(d1, d2, on = :c1 => :c2)), and it's also not the same as joins in R (left_join(d1, d2, by = c("c1" = "c2")) or left_join(d1, d2, by = join_by(c1 == c2))).

I fully recognize that this may be my own fault since I remember doing some of the initial work on the joins, but I think we should fix this to be consistent with R, probably going with the join_by version since that is the most up-to-date syntax.

drizk1 commented Dec 17, 2024

I was still learning tidyverse when joins were implemented, so I'm not sure.

That being said, with about 10 more lines in the parse_join_by function, I was able to add support for the following without breaking any existing code, though I'm sure there's a better way to do it.
The example data frames are from the DataFrames.jl docs:

@inner_join(a, b, join_by(City==Location, Job==Work))
@inner_join(a, b, join_by("City"=="Location", "Job"=="Work"))
@inner_join(a, b, join_by(:City=>:Location))
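
For anyone following along, here is a rough sketch of how those three forms could be normalized into column-name pairs. It only relies on how Julia parses the expressions; join_pair, colname, and join_pairs are hypothetical names made up for illustration and are not the actual parse_join_by code in TidierData.

```julia
# Hypothetical helpers for illustration only; NOT TidierData's parse_join_by.

# Column names can arrive as a bare symbol (City), a string ("City"),
# or a quoted symbol (:City), which the parser wraps in a QuoteNode.
colname(x::Symbol) = String(x)
colname(x::AbstractString) = String(x)
colname(x::QuoteNode) = String(x.value)

# Turn one join_by argument into a "left" => "right" pair.
# `a == b` and `a => b` both parse as a call expression: op(lhs, rhs).
function join_pair(ex)
    if ex isa Expr && ex.head == :call && length(ex.args) == 3 &&
       ex.args[1] in (:(==), :(=>))
        return colname(ex.args[2]) => colname(ex.args[3])
    end
    throw(ArgumentError("unsupported join_by argument: $ex"))
end

# Collect the pairs from a whole join_by(...) expression.
function join_pairs(ex::Expr)
    ex.head == :call && ex.args[1] == :join_by ||
        throw(ArgumentError("expected join_by(...)"))
    return [join_pair(arg) for arg in ex.args[2:end]]
end

# The three forms from this comment, quoted the way a macro would receive them.
pairs_bare   = join_pairs(:(join_by(City == Location, Job == Work)))
pairs_string = join_pairs(:(join_by("City" == "Location", "Job" == "Work")))
pairs_symbol = join_pairs(:(join_by(:City => :Location)))
```

The first two calls give the same pairs, so a join macro would only need to handle one normalized shape afterwards, whichever spelling the user picked.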

drizk1 mentioned this issue Dec 18, 2024