diff --git a/_freeze/posts/2023-10-08_datetimes-openxlsx/index/execute-results/html.json b/_freeze/posts/2023-10-08_datetimes-openxlsx/index/execute-results/html.json index 082ce14..6d91cb8 100644 --- a/_freeze/posts/2023-10-08_datetimes-openxlsx/index/execute-results/html.json +++ b/_freeze/posts/2023-10-08_datetimes-openxlsx/index/execute-results/html.json @@ -1,7 +1,7 @@ { - "hash": "6caedeee311271dccef76778ccccac07", + "hash": "0336b99906c453f098aa7254d0eb64d1", "result": { - "markdown": "---\ntitle: \"Detect date and time variables with openxlsx\"\nauthor: \"Layal C. Lettry\"\ndate: \"2023-10-08\"\ncategories: [code, openxlsx, date, datetime]\nimage: \"image.jpg\"\n---\n\n\n# Detect date variables\n\nWhen you try to read an excel file, the dates don't always look the way you would expect. You may see a vector of integers (or doubles) rather than a vector of dates. If you are using [openxlsx](https://github.com/ycphs/openxlsx), you can set `detectDates = TRUE` in the function `read.xlsx()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(openxlsx)\nlibrary(tidyverse)\nlibrary(readxl)\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nxlsxfile_path <- system.file(\"extdata\", \"readTest.xlsx\", package = \"openxlsx\")\n\n# Vector of doubles instead of dates\nxlsxfile_with_problems <- read.xlsx(xlsxfile_path, sheet = 3) |> \n as_tibble()\nxlsxfile_with_problems\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n# A tibble: 2,083 × 5\n Date value word bool wordZ2 \n \n 1 41757 0.839 N-U-B-R-A FALSE FALSE-Z\n 2 41756 0.886 N-Z-P-S-Y TRUE TRUE-Z \n 3 41755 0.574 C-G-D-X-H TRUE TRUE-Z \n 4 41754 0.137 FALSE FALSE-Z\n 5 41753 0.369 B-K-A-O-W TRUE TRUE-Z \n 6 41752 NA H-P-G-O-K TRUE TRUE-Z \n 7 41751 0.842 F-P-C-L-T TRUE TRUE-Z \n 8 41750 0.227 A-N-Q-P-V TRUE TRUE-Z \n 9 41749 0.276 Y-E-B-K-O TRUE TRUE-Z \n10 41748 0.419 V-S-N-T-R TRUE TRUE-Z \n# ℹ 2,073 more rows\n```\n:::\n\n```{.r .cell-code}\nglimpse(xlsxfile_with_problems)\n```\n\n::: {.cell-output 
.cell-output-stdout}\n```\nRows: 2,083\nColumns: 5\n$ Date 41757, 41756, 41755, 41754, 41753, 41752, 41751, 41750, 41749, …\n$ value 0.839076400, 0.886380000, 0.574131400, 0.136606500, 0.369258200…\n$ word \"N-U-B-R-A\", \"N-Z-P-S-Y\", \"C-G-D-X-H\", NA, \"B-K-A-O-W\", \"H-P-G-…\n$ bool FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, T…\n$ wordZ2 \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"…\n```\n:::\n\n```{.r .cell-code}\n# Vector of dates\nxlsxfile <- read.xlsx(xlsxfile_path, sheet = 3, detectDates = TRUE) |> \n as_tibble()\nxlsxfile\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n# A tibble: 2,083 × 5\n Date value word bool wordZ2 \n \n 1 2014-04-28 0.839 N-U-B-R-A FALSE FALSE-Z\n 2 2014-04-27 0.886 N-Z-P-S-Y TRUE TRUE-Z \n 3 2014-04-26 0.574 C-G-D-X-H TRUE TRUE-Z \n 4 2014-04-25 0.137 FALSE FALSE-Z\n 5 2014-04-24 0.369 B-K-A-O-W TRUE TRUE-Z \n 6 2014-04-23 NA H-P-G-O-K TRUE TRUE-Z \n 7 2014-04-22 0.842 F-P-C-L-T TRUE TRUE-Z \n 8 2014-04-21 0.227 A-N-Q-P-V TRUE TRUE-Z \n 9 2014-04-20 0.276 Y-E-B-K-O TRUE TRUE-Z \n10 2014-04-19 0.419 V-S-N-T-R TRUE TRUE-Z \n# ℹ 2,073 more rows\n```\n:::\n\n```{.r .cell-code}\nglimpse(xlsxfile)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nRows: 2,083\nColumns: 5\n$ Date 2014-04-28, 2014-04-27, 2014-04-26, 2014-04-25, 2014-04-24, 20…\n$ value 0.839076400, 0.886380000, 0.574131400, 0.136606500, 0.369258200…\n$ word \"N-U-B-R-A\", \"N-Z-P-S-Y\", \"C-G-D-X-H\", NA, \"B-K-A-O-W\", \"H-P-G-…\n$ bool FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, T…\n$ wordZ2 \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"…\n```\n:::\n:::\n\n\n# Convert double variables to date and time variables\n\nAnother way to convert a vector of integers is to use the function `convertToDate()` or `convertToDateTime()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nother_file <- readxl_example(path = \"type-me.xlsx\")\nxlsxfile_datetime <- read.xlsx(other_file, sheet = 
3) |> \n as_tibble() |> \n slice(2:3) |> \n select(`maybe.a.datetime?`) |> \n pull()\nxlsxfile_datetime\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"41051\" \"41026.479166666664\"\n```\n:::\n\n```{.r .cell-code}\nconvertToDate(xlsxfile_datetime[1])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"2012-05-22\"\n```\n:::\n\n```{.r .cell-code}\nconvertToDateTime(xlsxfile_datetime[2])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"2012-04-27 11:30:00 CEST\"\n```\n:::\n:::\n\n\n\n# Links\nThese examples are inspired by:\n- [https://rdrr.io/cran/openxlsxhttps://rdrr.io/cran/openxlsx](https://rdrr.io/cran/openxlsx/man/read.xlsx.html)\n\n- [https://readxl.tidyverse.org](https://readxl.tidyverse.org)\n", + "markdown": "---\ntitle: \"Detect date and time variables with openxlsx\"\nauthor: \"Layal C. Lettry\"\ndate: \"2023-10-08\"\ncategories: [openxlsx, date, datetime]\nimage: \"image.jpg\"\n---\n\n\n# Detect date variables\n\nWhen you try to read an excel file, the dates don't always look the way you would expect. You may see a vector of integers (or doubles) rather than a vector of dates. 
If you are using [openxlsx](https://github.com/ycphs/openxlsx), you can set `detectDates = TRUE` in the function `read.xlsx()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(openxlsx)\nlibrary(tidyverse)\nlibrary(readxl)\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nxlsxfile_path <- system.file(\"extdata\", \"readTest.xlsx\", package = \"openxlsx\")\n\n# Vector of doubles instead of dates\nxlsxfile_with_problems <- read.xlsx(xlsxfile_path, sheet = 3) |> \n as_tibble()\nxlsxfile_with_problems\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n# A tibble: 2,083 × 5\n Date value word bool wordZ2 \n \n 1 41757 0.839 N-U-B-R-A FALSE FALSE-Z\n 2 41756 0.886 N-Z-P-S-Y TRUE TRUE-Z \n 3 41755 0.574 C-G-D-X-H TRUE TRUE-Z \n 4 41754 0.137 FALSE FALSE-Z\n 5 41753 0.369 B-K-A-O-W TRUE TRUE-Z \n 6 41752 NA H-P-G-O-K TRUE TRUE-Z \n 7 41751 0.842 F-P-C-L-T TRUE TRUE-Z \n 8 41750 0.227 A-N-Q-P-V TRUE TRUE-Z \n 9 41749 0.276 Y-E-B-K-O TRUE TRUE-Z \n10 41748 0.419 V-S-N-T-R TRUE TRUE-Z \n# ℹ 2,073 more rows\n```\n:::\n\n```{.r .cell-code}\nglimpse(xlsxfile_with_problems)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nRows: 2,083\nColumns: 5\n$ Date 41757, 41756, 41755, 41754, 41753, 41752, 41751, 41750, 41749, …\n$ value 0.839076400, 0.886380000, 0.574131400, 0.136606500, 0.369258200…\n$ word \"N-U-B-R-A\", \"N-Z-P-S-Y\", \"C-G-D-X-H\", NA, \"B-K-A-O-W\", \"H-P-G-…\n$ bool FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, T…\n$ wordZ2 \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"…\n```\n:::\n\n```{.r .cell-code}\n# Vector of dates\nxlsxfile <- read.xlsx(xlsxfile_path, sheet = 3, detectDates = TRUE) |> \n as_tibble()\nxlsxfile\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n# A tibble: 2,083 × 5\n Date value word bool wordZ2 \n \n 1 2014-04-28 0.839 N-U-B-R-A FALSE FALSE-Z\n 2 2014-04-27 0.886 N-Z-P-S-Y TRUE TRUE-Z \n 3 2014-04-26 0.574 C-G-D-X-H TRUE TRUE-Z \n 4 2014-04-25 0.137 FALSE FALSE-Z\n 5 2014-04-24 0.369 B-K-A-O-W 
TRUE TRUE-Z \n 6 2014-04-23 NA H-P-G-O-K TRUE TRUE-Z \n 7 2014-04-22 0.842 F-P-C-L-T TRUE TRUE-Z \n 8 2014-04-21 0.227 A-N-Q-P-V TRUE TRUE-Z \n 9 2014-04-20 0.276 Y-E-B-K-O TRUE TRUE-Z \n10 2014-04-19 0.419 V-S-N-T-R TRUE TRUE-Z \n# ℹ 2,073 more rows\n```\n:::\n\n```{.r .cell-code}\nglimpse(xlsxfile)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nRows: 2,083\nColumns: 5\n$ Date 2014-04-28, 2014-04-27, 2014-04-26, 2014-04-25, 2014-04-24, 20…\n$ value 0.839076400, 0.886380000, 0.574131400, 0.136606500, 0.369258200…\n$ word \"N-U-B-R-A\", \"N-Z-P-S-Y\", \"C-G-D-X-H\", NA, \"B-K-A-O-W\", \"H-P-G-…\n$ bool FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, T…\n$ wordZ2 \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"FALSE-Z\", \"TRUE-Z\", \"TRUE-Z\", \"…\n```\n:::\n:::\n\n\n# Convert double variables to date and time variables\n\nAnother way to convert a vector of integers is to use the function `convertToDate()` or `convertToDateTime()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nother_file <- readxl_example(path = \"type-me.xlsx\")\nxlsxfile_datetime <- read.xlsx(other_file, sheet = 3) |> \n as_tibble() |> \n slice(2:3) |> \n select(`maybe.a.datetime?`) |> \n pull()\nxlsxfile_datetime\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"41051\" \"41026.479166666664\"\n```\n:::\n\n```{.r .cell-code}\nconvertToDate(xlsxfile_datetime[1])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"2012-05-22\"\n```\n:::\n\n```{.r .cell-code}\nconvertToDateTime(xlsxfile_datetime[2])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"2012-04-27 11:30:00 CEST\"\n```\n:::\n:::\n\n\n\n# Links\nThese examples are inspired by:\n\n- [https://rdrr.io/cran/openxlsxhttps://rdrr.io/cran/openxlsx](https://rdrr.io/cran/openxlsx/man/read.xlsx.html)\n\n- [https://readxl.tidyverse.org](https://readxl.tidyverse.org)\n", "supporting": [], "filters": [ "rmarkdown/pagebreak.lua" diff --git a/_freeze/posts/2023-10-08_rename-columns-lookup/index/execute-results/html.json 
b/_freeze/posts/2023-10-08_rename-columns-lookup/index/execute-results/html.json new file mode 100644 index 0000000..269ba9f --- /dev/null +++ b/_freeze/posts/2023-10-08_rename-columns-lookup/index/execute-results/html.json @@ -0,0 +1,14 @@ +{ + "hash": "2c1b7cd21ed9922aade253ecf8ae1982", + "result": { + "markdown": "---\ntitle: \"Rename variables in a data frame using an external lookup table\"\nauthor: \"Layal C. Lettry\"\ndate: \"2023-10-08\"\ncategories: [unquote-splice, tidy evaluation, rename, any_of]\nimage: \"image.jpg\"\n---\n\n\n# Rename variables in a data frame using an external lookup table\n\nSuppose you have a data frame in which some columns already have suitable names, while the remaining columns need to be renamed. A lookup table that maps the current names to the new ones is already available. \n\nI found the solution by using a tidy evaluation tool, the unquote-splice operator `!!!`, and by reading the [article written by Tim Tiefenbach](https://tim-tiefenbach.de/post/2022-rename-columns/#dplyr-tidyverse). \n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\n```\n:::\n\n\nHere is the data frame with three variables, namely `var1`, `var2` and `var4`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntest_tib <- tribble(\n ~var1, ~var2, ~var4,\n \"x\", \"a\", 1L,\n \"y\", \"b\", 2L,\n \"z\", \"c\", 3L\n)\n```\n:::\n\n\nDefine the lookup table with the new names, then turn it into a named vector with `deframe()`. 
Keep in mind that `deframe()` takes the names from its first column and the values from its second, so the first column must hold the new names and the second the current names.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnew_names <- tribble(\n ~names_var, ~new_names_var,\n \"var1\", \"Variable 1\",\n \"var2\", \"Variable 2\",\n \"var3\", \"Variable 3\",\n \"var4\", \"Variable 4\"\n)\n\nnew_names_vec <- deframe(select(new_names, new_names_var, names_var))\n```\n:::\n\n\n# Solution using tidy evaluation and base R\n\nOur goal is to rename only the columns that are actually in our data frame. The unquote-splice operator `!!!` splices the named vector into the dynamic dots `...` of `rename()`.\n\nSplicing the whole vector fails, however, because `var3` is in the lookup table but not in the data frame.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntest_tib |>\n rename(!!!new_names_vec)\n```\n\n::: {.cell-output .cell-output-error}\n```\nError in `rename()`:\n! Can't rename columns that don't exist.\n✖ Column `var3` doesn't exist.\n```\n:::\n:::\n\n\nKeep only the elements of `new_names_vec` whose values are column names of `test_tib`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntest_tib |>\n rename(!!!new_names_vec[new_names_vec %in% names(test_tib)])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n# A tibble: 3 × 3\n `Variable 1` `Variable 2` `Variable 4`\n \n1 x a 1\n2 y b 2\n3 z c 3\n```\n:::\n:::\n\n\n# Solution using dplyr\n\nInstead of selecting the common variables, you can use `any_of()` which does this selection automatically.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntest_tib |>\n rename(any_of(new_names_vec))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n# A tibble: 3 × 3\n `Variable 1` `Variable 2` `Variable 4`\n \n1 x a 1\n2 y b 2\n3 z c 3\n```\n:::\n:::\n\n\n\n# Sources\n\nThese examples are inspired by:\n\n- [Article written by Tim Tiefenbach](https://tim-tiefenbach.de/post/2022-rename-columns/#dplyr-tidyverse)\n\n- 
[https://dcl-prog.stanford.edu/tidy-eval-detailed.html](https://dcl-prog.stanford.edu/tidy-eval-detailed.html)\n\n- [https://adv-r.hadley.nz/quasiquotation.html#unquoting-many-arguments](https://adv-r.hadley.nz/quasiquotation.html#unquoting-many-arguments)\n\n- [https://rlang.r-lib.org/reference/topic-inject.html#splicing-with--1](https://rlang.r-lib.org/reference/topic-inject.html#splicing-with--1)\n\n- [https://rlang.r-lib.org/reference/dyn-dots.html](https://rlang.r-lib.org/reference/dyn-dots.html)\n", + "supporting": [], + "filters": [ + "rmarkdown/pagebreak.lua" + ], + "includes": {}, + "engineDependencies": {}, + "preserve": {}, + "postProcess": true + } +} \ No newline at end of file diff --git a/posts/2023-10-08_datetimes-openxlsx/index.qmd b/posts/2023-10-08_datetimes-openxlsx/index.qmd index f13c69e..5a67dfc 100644 --- a/posts/2023-10-08_datetimes-openxlsx/index.qmd +++ b/posts/2023-10-08_datetimes-openxlsx/index.qmd @@ -2,7 +2,7 @@ title: "Detect date and time variables with openxlsx" author: "Layal C. Lettry" date: "2023-10-08" -categories: [code, openxlsx, date, datetime] +categories: [openxlsx, date, datetime] image: "image.jpg" --- @@ -10,13 +10,19 @@ image: "image.jpg" When you try to read an excel file, the dates don't always look the way you would expect. You may see a vector of integers (or doubles) rather than a vector of dates. If you are using [openxlsx](https://github.com/ycphs/openxlsx), you can set `detectDates = TRUE` in the function `read.xlsx()`. 
-```{r load_libraries, eval=TRUE, message=FALSE, warning=FALSE} +```{r} +#| label: load_libraries +#| message: false +#| warning: false library(openxlsx) library(tidyverse) library(readxl) ``` -```{r detectdates, eval=TRUE, message=FALSE, warning=FALSE} +```{r} +#| label: detectdates +#| message: false +#| warning: false xlsxfile_path <- system.file("extdata", "readTest.xlsx", package = "openxlsx") # Vector of doubles instead of dates @@ -36,7 +42,10 @@ glimpse(xlsxfile) Another way to convert a vector of integers is to use the function `convertToDate()` or `convertToDateTime()`. -```{r convertodate, eval=TRUE, message=FALSE, warning=FALSE} +```{r} +#| label: convertodate +#| message: false +#| warning: false other_file <- readxl_example(path = "type-me.xlsx") xlsxfile_datetime <- read.xlsx(other_file, sheet = 3) |> as_tibble() |> @@ -50,7 +59,8 @@ convertToDateTime(xlsxfile_datetime[2]) ``` -# Links +# Sources + These examples are inspired by: - [https://rdrr.io/cran/openxlsxhttps://rdrr.io/cran/openxlsx](https://rdrr.io/cran/openxlsx/man/read.xlsx.html) diff --git a/posts/2023-10-08_rename-columns-lookup/image.jpg b/posts/2023-10-08_rename-columns-lookup/image.jpg new file mode 100644 index 0000000..095de21 Binary files /dev/null and b/posts/2023-10-08_rename-columns-lookup/image.jpg differ diff --git a/posts/2023-10-08_rename-columns-lookup/index.qmd b/posts/2023-10-08_rename-columns-lookup/index.qmd new file mode 100644 index 0000000..abc6f67 --- /dev/null +++ b/posts/2023-10-08_rename-columns-lookup/index.qmd @@ -0,0 +1,106 @@ +--- +title: "Rename variables in a data frame using an external lookup table" +author: "Layal C. Lettry" +date: "2023-10-08" +categories: [unquote-splice, tidy evaluation, rename, any_of] +image: "image.jpg" +--- + +# Rename variables in a data frame using an external lookup table + +Suppose you have a data frame in which some columns already have suitable names, while the remaining columns need to be renamed. 
A lookup table that maps the current names to the new ones is already available. + +I found the solution by using a tidy evaluation tool, the unquote-splice operator `!!!`, and by reading the [article written by Tim Tiefenbach](https://tim-tiefenbach.de/post/2022-rename-columns/#dplyr-tidyverse). + +```{r} +#| label: load_libraries +#| message: false +#| warning: false +library(tidyverse) +``` + +Here is the data frame with three variables, namely `var1`, `var2` and `var4`. + +```{r} +#| label: data +#| message: false +#| warning: false + +test_tib <- tribble( + ~var1, ~var2, ~var4, + "x", "a", 1L, + "y", "b", 2L, + "z", "c", 3L +) +``` + +Define the lookup table with the new names, then turn it into a named vector with `deframe()`. Keep in mind that `deframe()` takes the names from its first column and the values from its second, so the first column must hold the new names and the second the current names. + +```{r} +#| label: lookup +#| message: false +#| warning: false + +new_names <- tribble( + ~names_var, ~new_names_var, + "var1", "Variable 1", + "var2", "Variable 2", + "var3", "Variable 3", + "var4", "Variable 4" +) + +new_names_vec <- deframe(select(new_names, new_names_var, names_var)) +``` + +# Solution using tidy evaluation and base R + +Our goal is to rename only the columns that are actually in our data frame. The unquote-splice operator `!!!` splices the named vector into the dynamic dots `...` of `rename()`. + +Splicing the whole vector fails, however, because `var3` is in the lookup table but not in the data frame. + +```{r} +#| label: error +#| error: true +#| message: false +#| warning: false + +test_tib |> + rename(!!!new_names_vec) +``` + +Keep only the elements of `new_names_vec` whose values are column names of `test_tib`. 
+ +```{r} +#| label: base_r_solution +#| message: false +#| warning: false +test_tib |> + rename(!!!new_names_vec[new_names_vec %in% names(test_tib)]) +``` + +# Solution using dplyr + +Instead of selecting the common variables, you can use `any_of()` which does this selection automatically. + +```{r} +#| label: dplyr_solution +#| message: false +#| warning: false +test_tib |> + rename(any_of(new_names_vec)) +``` + + +# Sources + +These examples are inspired by: + +- [Article written by Tim Tiefenbach](https://tim-tiefenbach.de/post/2022-rename-columns/#dplyr-tidyverse) + +- [https://dcl-prog.stanford.edu/tidy-eval-detailed.html](https://dcl-prog.stanford.edu/tidy-eval-detailed.html) + +- [https://adv-r.hadley.nz/quasiquotation.html#unquoting-many-arguments](https://adv-r.hadley.nz/quasiquotation.html#unquoting-many-arguments) + +- [https://rlang.r-lib.org/reference/topic-inject.html#splicing-with--1](https://rlang.r-lib.org/reference/topic-inject.html#splicing-with--1) + +- [https://rlang.r-lib.org/reference/dyn-dots.html](https://rlang.r-lib.org/reference/dyn-dots.html) diff --git a/posts/test/image.jpg b/posts/test/image.jpg deleted file mode 100644 index 04fef3d..0000000 Binary files a/posts/test/image.jpg and /dev/null differ diff --git a/posts/test/index.qmd b/posts/test/index.qmd deleted file mode 100644 index 71b1d06..0000000 --- a/posts/test/index.qmd +++ /dev/null @@ -1,50 +0,0 @@ ---- -title: "test" -author: "Layal C. Lettry" -date: "2023-10-08" -categories: [code, openxlsx, date, datetime] -image: "image.jpg" ---- - -# Detect date variables - -When you try to read an excel file, the dates don't always look the way you would expect. You may see a vector of integers (or doubles) rather than a vector of dates. If you are using [openxlsx](https://github.com/ycphs/openxlsx), you can set `detectDates = TRUE` in the function `read.xlsx()`. 
- -```{r load_libraries, eval=TRUE, message=FALSE, warning=FALSE} -library(openxlsx) -library(tidyverse) -library(readxl) -``` - -```{r detectdates, eval=TRUE, message=FALSE, warning=FALSE} -xlsxfile_path <- system.file("extdata", "readTest.xlsx", package = "openxlsx") - -# Vector of doubles instead of dates -xlsxfile_with_problems <- read.xlsx(xlsxfile_path, sheet = 3) |> - as_tibble() -xlsxfile_with_problems -glimpse(xlsxfile_with_problems) - -# Vector of dates -xlsxfile <- read.xlsx(xlsxfile_path, sheet = 3, detectDates = TRUE) |> - as_tibble() -xlsxfile -glimpse(xlsxfile) -``` - -# Convert double variables to date and time variables - -Another way to convert a vector of integers is to use the function `convertToDate()` or `convertToDateTime()`. - -```{r convertodate, eval=TRUE, message=FALSE, warning=FALSE} -other_file <- readxl_example(path = "type-me.xlsx") -xlsxfile_datetime <- read.xlsx(other_file, sheet = 3) |> - as_tibble() |> - slice(2:3) |> - select(`maybe.a.datetime?`) |> - pull() -xlsxfile_datetime - -convertToDate(xlsxfile_datetime[1]) -convertToDateTime(xlsxfile_datetime[2]) -```