diff --git a/docs/slides/slides_data-wrangling_dplyr.html b/docs/slides/slides_data-wrangling_dplyr.html
index 76d8e4c..74439e7 100644
--- a/docs/slides/slides_data-wrangling_dplyr.html
+++ b/docs/slides/slides_data-wrangling_dplyr.html
@@ -444,7 +444,7 @@
-
+
 R for Lunch
@@ -1228,7 +1228,7 @@

R for Lunch

-2024-09-09
+2024-09-11

Today’s topics

@@ -1289,16 +1289,16 @@

Eat your own dog food

Model how R can work for practical reproducible workflows

Pipes and Assignments

 

- +
-
+
@@ -1364,7 +1364,7 @@

Wide data

gtExtras::gt_theme_dark()
-
+
@@ -2193,7 +2193,7 @@

Tall data

gtExtras::gt_theme_dark()
-
+
@@ -3042,9 +3042,9 @@

Closing

Pipes and Assignments

 

-
+
-
+
diff --git a/slides/slides_data-wrangling_dply.pdf b/slides/slides_data-wrangling_dply.pdf
deleted file mode 100644
index b3d7475..0000000
Binary files a/slides/slides_data-wrangling_dply.pdf and /dev/null differ
diff --git a/slides/slides_data-wrangling_dplyr.html b/slides/slides_data-wrangling_dplyr.html
deleted file mode 100644
index 5626a45..0000000
--- a/slides/slides_data-wrangling_dplyr.html
+++ /dev/null
@@ -1,4343 +0,0 @@
-R for Lunch
-
- -
-

R for Lunch

-

Data wrangling with {dplyr}

- -
-
-
-John Little -
-

- - Duke University Libraries - -

-

- Center for Data & Visualization Sciences -

-
-
- -

2024-01-17

-
-
-

Today’s topics

-
  • Five essential {dplyr} data wrangling verbs
  • Data pipes inside code-chunks
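The slide does not name the five verbs. As a minimal sketch, assuming they are the commonly taught set `select()`, `filter()`, `mutate()`, `arrange()`, and `summarise()`, the pipeline below chains them with the native pipe. The built-in `mtcars` data and the `kpl` column are stand-ins chosen only for illustration; they are not part of the workshop material.

```r
library(dplyr)

# Assumption: these five verbs are the "essential" set; the slide does not list them.
result <- mtcars |>
  select(mpg, cyl, wt) |>          # keep only three columns
  filter(wt < 4) |>                # keep rows where weight is under 4 (1000 lbs)
  mutate(kpl = mpg * 0.425) |>     # add a column: approximate km per litre
  arrange(desc(kpl)) |>            # sort rows, most efficient first
  summarise(mean_kpl = mean(kpl),  # collapse to a one-row summary
            n = n())

result
```

Each step receives the data frame produced by the step before it, which is why the pipe is read as "and then".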
-

Yesterday (video)

-
  • Import data
  • Tour of RStudio IDE
  • Coding notebooks (Quarto)
-
-
-

Housekeeping

-
  • Drew / Lauren / breakout rooms
  • CDVS
    • Themes
      • Data Management (Plans, Reproducibility, Repositories)
      • Data Science
      • Data Visualization
      • GIS and Spatial Analysis
      • Data Sources
-
-
-

Housekeeping continued

- -
-
-

R for Lunch as a series

-

R for Lunch is a series that meets eight times, through the end of February. After today, it meets regularly on Thursdays at noon.

-
  • Sign up for each workshop individually
  • Each episode has a unique Zoom link
-
-
-

Eat your own dog food

-


-Model how R can work for practical reproducible workflows

- -
-
-

Pipes and Assignments

-

 

-
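+-------------+---------------+--------------+-------------------+
| Operator    | Operator Name | Keystroke    | Mnemonic          |
+=============+===============+==============+===================+
| `<-`        | assignment    | Alt-dash     | “Gets value from” |
+-------------+---------------+--------------+-------------------+
| `|>` or     | pipe          | Ctrl-Shift-M | “And then”        |
| `%>%`       |               |              |                   |
+-------------+---------------+--------------+-------------------+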
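A small sketch of how the two operators read in practice, using base R only. The variable names (`scores`, `total_piped`, `total_nested`) are hypothetical and chosen for illustration. The native pipe `|>` requires R 4.1 or later; `%>%` from {magrittr} reads the same way in a chain like this.

```r
# Assignment: `scores` "gets value from" the vector on the right.
scores <- c(4, 9, 16)

# Pipe: take `scores`, "and then" take square roots, "and then" sum them.
total_piped <- scores |> sqrt() |> sum()

# The same computation written as nested calls, for comparison.
total_nested <- sum(sqrt(scores))

# Both forms compute the same value.
identical(total_piped, total_nested)
```

The piped form lists the steps in the order they happen, while the nested form has to be read from the inside out.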
-
-
-
-

Tidyverse and Tidy data

- -
-
-

Foundation

-

 

-

Tidyverse and Quarto together are the most practical and well-developed reproducible workflow available for scientific analysis and publishing.

-
-
-

Tidy data

-
-
-

Tidy data

- -
-
-

Wide data

-
-
-Code -
library(tidyverse)
-library(gt)
-library(gtExtras)
-
-tidyr::relig_income |> 
-  gt::gt_preview() |> 
-  gtExtras::gt_theme_dark()
-
-
-
|       | religion           | <$10k | $10-20k | $20-30k | $30-40k | $40-50k | $50-75k | $75-100k | $100-150k | >150k | Don't know/refused |
|-------|--------------------|-------|---------|---------|---------|---------|---------|----------|-----------|-------|--------------------|
| 1     | Agnostic           | 27    | 34      | 60      | 81      | 76      | 137     | 122      | 109       | 84    | 96                 |
| 2     | Atheist            | 12    | 27      | 37      | 52      | 35      | 70      | 73       | 59        | 74    | 76                 |
| 3     | Buddhist           | 27    | 21      | 30      | 34      | 33      | 58      | 62       | 39        | 53    | 54                 |
| 4     | Catholic           | 418   | 617     | 732     | 670     | 638     | 1116    | 949      | 792       | 633   | 1489               |
| 5     | Don’t know/refused | 15    | 14      | 15      | 11      | 10      | 35      | 21       | 17        | 18    | 116                |
| 6..17 |                    |       |         |         |         |         |         |          |           |       |                    |
| 18    | Unaffiliated       | 217   | 299     | 374     | 365     | 341     | 528     | 407      | 321       | 258   | 597                |
-
-
-
-
-

Tall data

-
-
-
-
-Code -
relig_income |> 
-  pivot_longer(cols = -religion, 
-               names_to = "income_category", 
-               values_to = "income") |> 
-  gt::gt_preview() |> 
-  gtExtras::gt_theme_dark()
-
-
-
|        | religion     | income_category    | income |
|--------|--------------|--------------------|--------|
| 1      | Agnostic     | <$10k              | 27     |
| 2      | Agnostic     | $10-20k            | 34     |
| 3      | Agnostic     | $20-30k            | 60     |
| 4      | Agnostic     | $30-40k            | 81     |
| 5      | Agnostic     | $40-50k            | 76     |
| 6..179 |              |                    |        |
| 180    | Unaffiliated | Don't know/refused | 597    |
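The row count in the tall preview follows from the shape of the wide table: each of the 18 religions contributes one row per income column, and there are 10 income columns. A minimal check, assuming {tidyr} is installed and reusing the same `pivot_longer()` call as the slide (the names `wide` and `tall` are hypothetical):

```r
library(tidyr)

wide <- relig_income   # one row per religion, one column per income bracket

tall <- wide |>
  pivot_longer(cols = -religion,
               names_to = "income_category",
               values_to = "income")

# Rows in the tall table = rows in the wide table x number of pivoted columns.
nrow(tall) == nrow(wide) * (ncol(wide) - 1)
```

The tall table always has three columns here (`religion`, `income_category`, `income`), no matter how many income brackets the wide table had.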
- -
-
-
-
-
-
-Code -
relig_income |> 
-  pivot_longer(cols = -religion, 
-               names_to = "income_category", 
-               values_to = "income") |> 
-  mutate(religion = fct_relevel(religion, "Evangelical Prot", "Mainline Prot", "Catholic", "Unaffiliated", "Historically Black Prot")) |> 
-  mutate(income_category = fct_rev(as_factor(income_category))) |>
-  ggplot(aes(income, income_category)) +
-  geom_col(fill = "#eee8d5") +
-  facet_wrap(vars(
-    fct_other(
-      religion, 
-      keep = c("Evangelical Prot", "Mainline Prot", "Catholic", "Unaffiliated", "Historically Black Prot")))) +
-  theme(plot.background = element_rect(fill = "#002b36"),
-        text = element_text(color = "#eee8d5"),
-        axis.text = element_text(color = "#eee8d5"), 
-        panel.background = element_rect(fill = "#002b36"),
-        panel.grid = element_line(color = "#002b36"),
-        strip.background = element_rect(fill = "#7b9c9f"))
-
-
-
-
-

-
-
-
-
-
-
-
-
-

Code

-

 
-

-
-
relig_income |> 
-  pivot_longer(cols = -religion, names_to = "income_category") |> 
-  ggplot(aes(value, income_category)) +
-  geom_col() +
-  facet_wrap(vars(religion))
-
-
-

Image Credit: apreshill | CC BY 4.0 | https://github.com/apreshill/teachthat/blob/master/pivot/pivot_longer_smaller.gif

-
-
-
-
-

Polls

- -
-
-

dplyr

-

https://intro2r.library.duke.edu/wrangle.html

-
-
-

We are here to help

- -
- -
-
-

Let’s do it

- -
-
-

Two things for today

- -
-
-

Exercises

-
    -
  1. https://intro2r.library.duke.edu/ > Exercises > Link out > Green Code button > Download ZIP
  2. Then, unzip (i.e. expand) the folder on your local file system
  3. Then, double-click the rforlunch_exercises.Rproj file
  4. From the RStudio Files tab, open 01_dplyr.qmd
    • The answer file is in the RStudio rforlunch_exercises project > Files tab > Answers folder
-
-
-
-

Closing

- -
-
-

Pipes and Assignments

-

 

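+-------------+---------------+--------------+-------------------+
| Operator    | Operator Name | Keystroke    | Mnemonic          |
+=============+===============+==============+===================+
| `<-`        | assignment    | Alt-dash     | “Gets value from” |
+-------------+---------------+--------------+-------------------+
| `|>` or     | pipe          | Ctrl-Shift-M | “And then”        |
| `%>%`       |               |              |                   |
+-------------+---------------+--------------+-------------------+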
-
-
-

Citation management

-

 

-

RStudio > Quarto Notebook > Insert > Citation

-

Example DOI: 10.18637/jss.v059.i10

-
-
-

AI-paired coding

-

 

- -
-
-

Bye for now

- - -
- - -
-

- -
\ No newline at end of file
diff --git a/slides/slides_data-wrangling_dplyr.qmd b/slides/slides_data-wrangling_dplyr.qmd
index 93d2b72..f542801 100644
--- a/slides/slides_data-wrangling_dplyr.qmd
+++ b/slides/slides_data-wrangling_dplyr.qmd
@@ -77,7 +77,7 @@ Model how R can work for practical reproducible workflows
 - Code in RStudio
-- One kind of report is these slides ([GitHub](https://github.com/libjohn/rforlunch_exercises/blob/main/slides/slides_import-data.pdf "PDF report - slides"))
+- One kind of report is these slides (today: [data wrangling](https://libjohn.github.io/rforlunch_exercises/slides/slides_data-wrangling_dplyr.html#/eat-your-own-dog-food) ; yesterday: [import data](https://libjohn.github.io/rforlunch_exercises/slides/slides_import-data.html))
 - Another report is the [*Introduction to R/Tidyverse/Quarto* text](https://intro2r.library.duke.edu/).
@@ -85,16 +85,16 @@ Model how R can work for practical reproducible workflows
  
-+----------+---------------+--------------+-------------------+
-| Operator | Operator Name | Keystore     | Pnuemonic         |
-+==========+===============+==============+===================+
-| `<-`     | assignment    | Alt-dash     | "Gets value from" |
-+----------+---------------+--------------+-------------------+
-| `|>`\    | pipe          | Ctrl-Shift-M | "And then"        |
-| or       |               |              |                   |
-|          |               |              |                   |
-| `%>%`    |               |              |                   |
-+----------+---------------+--------------+-------------------+
++-------------+---------------+--------------+-------------------+
+| Operator    | Operator Name | Keystore     | Pnuemonic         |
++=============+===============+==============+===================+
+| `<-`        | assignment    | Alt-dash     | "Gets value from" |
++-------------+---------------+--------------+-------------------+
+| `|>`\       | pipe          | Ctrl-Shift-M | "And then"        |
+| or          |               |              |                   |
+|             |               |              |                   |
+| `%>%`       |               |              |                   |
++-------------+---------------+--------------+-------------------+
 # Tidyverse and Tidy data
@@ -238,16 +238,16 @@ Image Credit: apreshill \| CC BY 4.0 \| https://github.com/apreshill/teachthat/b
  
-+----------+---------------+--------------+-------------------+
-| Operator | Operator Name | Keystore     | Pnuemonic         |
-+==========+===============+==============+===================+
-| `<-`     | assignment    | Alt-dash     | "Gets value from" |
-+----------+---------------+--------------+-------------------+
-| `|>`\    | pipe          | Ctrl-Shift-M | "And then"        |
-| or       |               |              |                   |
-|          |               |              |                   |
-| `%>%`    |               |              |                   |
-+----------+---------------+--------------+-------------------+
++-------------+---------------+--------------+-------------------+
+| Operator    | Operator Name | Keystore     | Pnuemonic         |
++=============+===============+==============+===================+
+| `<-`        | assignment    | Alt-dash     | "Gets value from" |
++-------------+---------------+--------------+-------------------+
+| `|>`\       | pipe          | Ctrl-Shift-M | "And then"        |
+| or          |               |              |                   |
+|             |               |              |                   |
+| `%>%`       |               |              |                   |
++-------------+---------------+--------------+-------------------+
 ## Citation management
diff --git a/slides/slides_import-data.pdf b/slides/slides_import-data.pdf
deleted file mode 100644
index 4e57b28..0000000
Binary files a/slides/slides_import-data.pdf and /dev/null differ