col_types missing from read_csv() but present in docs #21

Closed
KadeG opened this issue Nov 8, 2024 · 5 comments

Comments


KadeG commented Nov 8, 2024

col_types appears to be missing from read_csv() but present in the docs: https://tidierorg.github.io/TidierFiles.jl/latest/reference/#TidierFiles.read_csv-Tuple{Any}

Tangential: it would be awesome if we could pass through some of the other options that CSV.read accepts, like normalizenames, etc.

Member

drizk1 commented Nov 8, 2024

Hmm, idk what happened there, but we can def add support for both.

Just wondering, when using R, what would your ideal col_types usage look like? It's not one I have used frequently.

Author

KadeG commented Nov 8, 2024

I've actually never used Tidier in R, only Tidier.jl because it's awesome. But I frequently use col_types when dealing with data with leading zeros: it gets interpreted as Int which strips the leading zeros so you can't fix it after reading it in. Should be able to pass to CSV.File: https://csv.juliadata.org/stable/examples.html#types_example
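For readers following along, here is a minimal sketch of the leading-zeros problem described above, using CSV.jl directly rather than TidierFiles. The `zip`/`city` data is made up for illustration; the `types` keyword with a `Dict` follows the CSV.jl example linked above.

```julia
using CSV, DataFrames

# Made-up sample data: the zip column has leading zeros.
csv_text = """
zip,city
02134,Boston
00501,Holtsville
"""

# With default type inference the zip column is expected to be parsed
# as an integer, which is what drops the leading zeros.
inferred = CSV.read(IOBuffer(csv_text), DataFrame)

# Forcing String for just that column keeps the text exactly as written.
typed = CSV.read(IOBuffer(csv_text), DataFrame; types = Dict(:zip => String))
```

Only the columns named in the `Dict` are overridden; the rest are still inferred.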

Would it be easier to have a helper function to interpret passing through the various extra arguments CSV.File can take?

Member

drizk1 commented Nov 8, 2024

Oh wow, I love that.

So basically, if I'm understanding correctly, there are times when you need to specify the column type you are reading for a select couple columns?

It might be easier to have a helper function.

I'm open to any approach really.

If you have a plan in mind, feel free to give it a spin; if not, I can piece something together.
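One way the helper-function idea could look is a thin wrapper that renames `col_types` to CSV.jl's `types` and forwards every other keyword untouched. This is only a sketch under that assumption: `read_csv_sketch` is a hypothetical name, not TidierFiles' actual implementation.

```julia
using CSV, DataFrames

# Hypothetical wrapper (not the TidierFiles API): map `col_types` onto
# CSV.jl's `types` keyword and pass all remaining keywords straight through.
function read_csv_sketch(source; col_types = nothing, kwargs...)
    if col_types !== nothing
        return CSV.read(source, DataFrame; types = col_types, kwargs...)
    end
    return CSV.read(source, DataFrame; kwargs...)
end

# Usage: override one column's type and forward a CSV.jl option unchanged.
df = read_csv_sketch(IOBuffer("id,my col\n007,1\n");
                     col_types = Dict(:id => String),
                     normalizenames = true)
```

Forwarding `kwargs...` means new CSV.jl options work without the wrapper needing to know about each one.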

Member

drizk1 commented Nov 12, 2024

Alright, so if you add the branch below, you can set the column types and also use any arg that already exists in CSV.jl, including normalizenames. I have to do a bit of docs fixes and mb one or two other things before release, but it should be good to go if you want to try it.

using Pkg; Pkg.add(url = "https://github.com/TidierOrg/TidierFiles.jl", rev = "add_typesandallargs")
using TidierFiles

# col_types is a simple wrapper for types
read_csv(mtcars_path, col_types = Dict(:hp => Float64, :vs => String), normalizenames = false)

# this demonstrates using the underlying arg name `types`
read_csv(mtcars_path, types = Dict(:hp => Float64, :vs => String))

Member

drizk1 commented Nov 13, 2024

Alright, with v.1.6 all the features you requested should be available.

I'm going to close this issue for now but plz reopen / open a new issue if there's anything not working / something else you need.

drizk1 closed this as completed Nov 13, 2024