adds read mult file support to all delims, parquet #14

Merged 1 commit on Jul 12, 2024
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "TidierFiles"
uuid = "8ae5e7a9-bdd3-4c93-9cc3-9df4d5d947db"
authors = ["Daniel Rizk <[email protected]> and contributors"]
-version = "0.1.2"
+version = "0.1.3"

[deps]
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
17 changes: 16 additions & 1 deletion README.md
@@ -71,7 +71,6 @@ The path can be a file available either locally or on the web.
```julia
read_csv("https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testing_files/csvtest.csv", skip = 2, n_max = 3, col_select = ["ID", "Score"], missingstring = ["4"])
```

```
3×2 DataFrame
Row │ ID Score
@@ -80,4 +79,20 @@ read_csv("https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testin
1 │ 3 77
2 │ missing 85
3 │ 5 95
```

Read multiple files by passing paths as a vector.
```julia
path = "https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testing_files/csvtest.csv"
read_csv([path, path], skip=3)
```
```
4×3 DataFrame
 Row │ ID     Name     Score
     │ Int64  String7  Int64
─────┼───────────────────────
   1 │     4  David       85
   2 │     5  Eva         95
   3 │     4  David       85
   4 │     5  Eva         95
```
2 changes: 1 addition & 1 deletion docs/examples/UserGuide/delim.jl
@@ -16,7 +16,7 @@ read_csv("https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testin

#These functions read a delimited file (CSV, TSV, or custom delimiter) into a DataFrame. The arguments are:

-# - `file`: Path to the file or a URL.
+# - `file`: Path or URL of the file, or a vector of paths or URLs.
# - `delim`: Field delimiter. Default is ',' for `read_csv`, '\t' for `read_tsv` and `read_delim`.
# - `col_names`: Use first row as column names. Can be `true`, `false`, or an array of strings. Default is `true`.
# - `skip`: Number of lines to skip before reading data. Default is 0.
2 changes: 1 addition & 1 deletion docs/examples/UserGuide/parquet.jl
@@ -4,7 +4,7 @@

# This function reads a Parquet (.parquet) file into a DataFrame. The arguments are:

-# - `path`: The path to the .parquet file.
+# - `path`: The path or URL of the .parquet file, or a vector of paths or URLs.
# - `col_names`: Indicates if the first row of the file is used as column names. Default is `true`.
# - `skip`: Number of initial rows to skip before reading data. Default is 0.
# - `n_max`: Maximum number of rows to read. Default is `Inf` (read all rows).
17 changes: 16 additions & 1 deletion docs/src/index.md
@@ -68,7 +68,6 @@ The path can be a file available either locally or on the web.
```julia
read_csv("https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testing_files/csvtest.csv", skip = 2, n_max = 3, col_select = ["ID", "Score"], missingstring = ["4"])
```

```
3×2 DataFrame
Row │ ID Score
@@ -77,4 +76,20 @@ read_csv("https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testin
1 │ 3 77
2 │ missing 85
3 │ 5 95
```

Read multiple files by passing paths as a vector.
```julia
path = "https://raw.githubusercontent.com/TidierOrg/TidierFiles.jl/main/testing_files/csvtest.csv"
read_csv([path, path], skip=3)
```
```
4×3 DataFrame
 Row │ ID     Name     Score
     │ Int64  String7  Int64
─────┼───────────────────────
   1 │     4  David       85
   2 │     5  Eva         95
   3 │     4  David       85
   4 │     5  Eva         95
```
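
For readers wondering what passing a vector of paths amounts to, here is a minimal sketch of the general idea: read each file into its own DataFrame and stack the rows. It calls CSV.jl and DataFrames.jl directly and does not reproduce the TidierFiles implementation in this PR; `read_many_csv` is a hypothetical helper name, and the two temporary files are made up for illustration.

```julia
using CSV, DataFrames

# Hypothetical helper (not part of TidierFiles): read every path with CSV.jl
# and vertically concatenate the resulting DataFrames.
function read_many_csv(paths::AbstractVector{<:AbstractString}; kwargs...)
    frames = [CSV.read(p, DataFrame; kwargs...) for p in paths]
    return reduce(vcat, frames)  # all files must share the same column names
end

# Two small illustrative files written to a temporary directory.
dir = mktempdir()
a = joinpath(dir, "a.csv")
b = joinpath(dir, "b.csv")
write(a, "ID,Score\n1,90\n2,80\n")
write(b, "ID,Score\n3,70\n")

df = read_many_csv([a, b])
```

Keyword arguments are forwarded to `CSV.read`, so they use CSV.jl's option names rather than the `skip` and `n_max` arguments documented for `read_csv`.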