diff --git a/help.html b/help.html
index d5d8f9796..0677d532b 100644
--- a/help.html
+++ b/help.html
@@ -347,14 +347,14 @@

Why are my changes not taking effect? It’s making my results look

Here we are creating a new object from an existing one:

new_rivers <- sample(rivers, 5)
 new_rivers
-
## [1]  340  330  720 1038  524
+
## [1] 360 392 352 625 460

Running this on its own only prints the result; it does not change new_rivers:

new_rivers + 1
-
## [1]  341  331  721 1039  525
+
## [1] 361 393 353 626 461

If we want to modify new_rivers and save that modified version, then we need to reassign new_rivers like so:

new_rivers <- new_rivers + 1
 new_rivers
-
## [1]  341  331  721 1039  525
+
## [1] 361 393 353 626 461

If we forget to reassign, later steps may not work as expected, because we will still be working with the unmodified data.
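As a minimal sketch of the same point, the check below uses `identical()` to confirm whether an object actually changed. The vector `lengths_km` and its values are hypothetical and are not part of the lesson data.

```r
# Hypothetical vector for illustration only
lengths_km <- c(100, 200, 300)

# Without reassignment the result is printed and then discarded
lengths_km + 1
unchanged <- identical(lengths_km, c(100, 200, 300))

# With reassignment the modified values are stored
lengths_km <- lengths_km + 1
changed <- identical(lengths_km, c(101, 201, 301))

unchanged  # TRUE: the first call did not modify lengths_km
changed    # TRUE: the reassignment did
```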


@@ -403,7 +403,7 @@

Error: object ‘X’ not found

Make sure you run something like this, with the <- operator:

rivers2 <- new_rivers + 1
 rivers2
-
## [1]  342  332  722 1040  526
+
## [1] 362 394 354 627 462

diff --git a/modules/Data_Classes/Data_Classes.html b/modules/Data_Classes/Data_Classes.html
index 05914e7ef..81aa51358 100644
--- a/modules/Data_Classes/Data_Classes.html
+++ b/modules/Data_Classes/Data_Classes.html
@@ -223,6 +223,10 @@

## [1] "character"
+

Why is Class important?

+ +

The class of the data tells R how to process the data. For example, it determines whether you can make summary statistics (numbers) or if you can sort alphabetically (characters).
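A small sketch of why class matters, assuming numbers that were read in as text. The vector `ages_chr` is a made-up example and not part of the slides.

```r
# Hypothetical values stored as text, as often happens after reading a file
ages_chr <- c("21", "35", "48")
class(ages_chr)

# mean() cannot summarize character data: it warns and returns NA
mean_chr <- suppressWarnings(mean(ages_chr))

# After converting the class, the summary statistic works
ages_num <- as.numeric(ages_chr)
class(ages_num)
mean_num <- mean(ages_num)

mean_chr  # NA
mean_num  # about 34.67
```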

+

General Class Information

There is one useful function associated with practically all R classes:

@@ -275,9 +279,9 @@

## [1]  TRUE FALSE    NA
-
as.Date(c("2021-06-15", "2021-06-32"))
+

GUT CHECK!

-
## [1] "2021-06-15" NA
+

What is one reason we might want to convert data to numeric?

A. So we can take the mean

B. So the data looks better

C. So our data is correct

Number Subclasses

@@ -546,51 +550,6 @@

  • tibbles show column classes!
  • -

    Two-dimensional data classes

    - -

    Two-dimensional data classes

    - -

    Two-dimensional classes are those we would often use to store data read from a file

    - -
      -
    • a data frame (data.frame or tibble class)

    • -
    • a matrix (matrix class)

      - -
        -
      • also composed of rows and columns
      • -
      • unlike data.frame or tibble, the entire matrix is composed of one R class
      • -
      • for example: all entries are numeric, or all entries are character
      • -
    • -
    - -

    Lists

    - -
      -
    • Lists are the most generic data type.
    • -
    • They can hold vectors, strings, matrices, models, even lists of other lists!
    • -
    • Lists are used when you need to do something repeatedly across lots of data - for example wrangling several similar files at once
    • -
    • Lists are a bit more advanced but you may encounter them when you work with others or look up solutions
    • -
    - -

    Making Lists

    - -
      -
    • Can be created using list()
    • -
    - -
    mylist <- list(c("A", "b", "c"), c(1, 2, 3))
    -mylist
    - -
    ## [[1]]
    -## [1] "A" "b" "c"
    -## 
    -## [[2]]
    -## [1] 1 2 3
    - -
    class(mylist)
    - -
    ## [1] "list"
    -

    Special data classes

    Dates

    @@ -628,34 +587,20 @@

    Note for function ymd: year month day

    -

    Dates are useful!

    - -
    a <- ymd("2021-06-15")
    -b <- ymd("2021-06-18")
    -a - b
    - -
    ## Time difference of -3 days
    - -

    Creating Date class object

    - -

    date() is picky…

    - -
    date("06/15/2021") # This doesn't work, needs to be year month day
    +

    The function must match the format

    -
    ## Error in as.POSIXlt.character(x, tz = tz(x)): character string is not in a standard unambiguous format
    +
    mdy("06/15/2021")
    -

    But we can use the month day year function mdy

    +
    ## [1] "2021-06-15"
    -
    mdy("06/15/2021") # This works
    +
    dmy("15-June-2021")
    ## [1] "2021-06-15"
    -
    mdy("06/15/21") # This works
    +
    ymd("2021-06-15")
    ## [1] "2021-06-15"
    -

    Note for function mdy: month day year

    -

    The right lubridate function needs to be used

    Must match the data format!
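The sketch below illustrates the mismatch case, assuming the lubridate package is installed. The string `x` is a hypothetical input, and `suppressWarnings()` is used only to keep the parse warning out of the output.

```r
library(lubridate)

# Hypothetical date string written month/day/year
x <- "06/15/2021"

# ymd() expects year-month-day, so this string fails to parse and gives NA
wrong <- suppressWarnings(ymd(x))

# mdy() matches the order in the string
right <- mdy(x)

is.na(wrong)
right
class(right)
```

Checking for `NA` after parsing is a quick way to spot a function that does not match the format of the data.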

    @@ -670,7 +615,15 @@

    ## [1] "2021-06-15"
    -

    Creating POSIXct class object

    +

    Dates are useful!

    + +
    a <- ymd("2021-06-15")
    +b <- ymd("2021-06-18")
    +a - b
    + +
    ## Time difference of -3 days
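As a small follow-on sketch (assuming lubridate is installed), the difference between two dates can be turned into a plain number, and parts of a date can be pulled out with the lubridate helpers `year()` and `month()`. The variable names `gap_days`, `yr`, and `mo` are illustrative.

```r
library(lubridate)

a <- ymd("2021-06-15")
b <- ymd("2021-06-18")

# Convert the time difference into a plain number of days
gap_days <- as.numeric(b - a, units = "days")

# Pull out individual components of a date
yr <- year(a)
mo <- month(a)

gap_days  # 3
yr        # 2021
mo        # 6
```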
    + +

    Can also include hours, minutes, seconds

    class("2013-01-24 19:39:07")
    @@ -680,7 +633,7 @@

    ## [1] "2013-01-24 19:39:07 UTC"
    -
    class(ymd_hms("2013-01-24 19:39:07")) # lubridate package
    +
    ymd_hms("2013-01-24 19:39:07") %>% class()
    ## [1] "POSIXct" "POSIXt"
    @@ -688,9 +641,7 @@

    Note for function ymd_hms: year month day hour minute second.

    -

    There are functions in case your data have only date, hour and minute (ymd_hm()) or only date and hour (ymd_h()).

    - -

    In a dataframe

    +

    Class conversion in a dataset

    Note dates are always displayed year month day, even if made with mdy!

    @@ -718,24 +669,55 @@

    ## $ year <dbl> 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2…
    ## $ month <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
    +

    Other data classes

    + +

    Two-dimensional data classes

    + +

    Two-dimensional classes are those we would often use to store data read from a file

    • a data frame (data.frame or tibble class)
    • a matrix (matrix class)
      • also composed of rows and columns
      • unlike data.frame or tibble, the entire matrix is composed of one R class
      • for example: all entries are numeric, or all entries are character

    + +

    Lists

    + +
      +
    • Lists are the most generic data type.
    • +
    • They can hold vectors, strings, matrices, models, even lists of other lists!
    • +
    • Lists are used when you need to do something repeatedly across lots of data - for example wrangling several similar files at once
    • +
    • Lists are a bit more advanced but you may encounter them when you work with others or look up solutions
    • +
    + +

    Making Lists

    + +

    Can be created using list()

    + +
    mylist <- list(c("A", "b", "c"), c(1, 2, 3))
    +mylist
    + +
    ## [[1]]
    +## [1] "A" "b" "c"
    +## 
    +## [[2]]
    +## [1] 1 2 3
    + +
    class(mylist)
    + +
    ## [1] "list"
    +
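The `[[1]]` and `[[2]]` labels in the printed output hint at how list elements are reached. The sketch below goes slightly beyond the slide and shows double versus single brackets in base R, reusing the same `mylist`; the names `second` and `wrapped` are illustrative.

```r
mylist <- list(c("A", "b", "c"), c(1, 2, 3))

# Double brackets return the element itself
second <- mylist[[2]]
class(second)

# Single brackets return a shorter list that still wraps the element
wrapped <- mylist[2]
class(wrapped)

length(mylist)
```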

    Summary

      -
    • two dimensional object classes include: data frames, tibbles, matrices, and lists
    • -
    • matrix has columns and rows but is all one data class
    • +
    • coerce between classes using as.numeric() or as.character()
    • +
    • data frames, tibbles, matrices, and lists are all classes of objects
    • lists can contain multiples of any other class of data including lists!
    • -
    • calendar dates can be represented with the Date class using ymd(), mdy() functions from lubridate package
    • -
    • Make sure you choose the right function for the way the date is formatted!
    • -
    • POSIXct class representing a calendar date with hours, minutes, seconds. Can use ymd_hms() or ymd_hm() or ymd_h()functions from the lubridate package
    • -
    • can then easily subtract Date or POSIXct class variables or pull out aspects like year
    • +
    • calendar dates can be represented with the Date class using the ymd() and mdy() functions from the lubridate package
    -

    Lab Part 1

    +

    Lab

    🏠 Class Website

    💻 Lab

    +

    See the extra slides for more advanced topics.

    +

    The End

    Image by Gerd Altmann from Pixabay

diff --git a/modules/Data_Classes/lab/Data_Classes_Lab.Rmd b/modules/Data_Classes/lab/Data_Classes_Lab.Rmd
index 50cb89a3b..3437a419f 100644
--- a/modules/Data_Classes/lab/Data_Classes_Lab.Rmd
+++ b/modules/Data_Classes/lab/Data_Classes_Lab.Rmd
@@ -12,11 +12,7 @@ editor_options:
 Load all the packages we will use in this lab.
 
 ```{r}
-library(readr)
 library(tidyverse)
-library(dplyr)
-library(lubridate)
-library(jhur)
 ```
 
 Create some data to work with by running the following code chunk.
diff --git a/modules/Data_Classes/lab/Data_Classes_Lab_Key.html b/modules/Data_Classes/lab/Data_Classes_Lab_Key.html
index 1acacfbc4..7608209c7 100644
--- a/modules/Data_Classes/lab/Data_Classes_Lab_Key.html
+++ b/modules/Data_Classes/lab/Data_Classes_Lab_Key.html
@@ -170,20 +170,17 @@

    Part 1

    1.1

    Load all the packages we will use in this lab.

    -
    library(readr)
    -library(tidyverse)
    +
    library(tidyverse)
    ## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
    -## ✔ dplyr     1.1.4     ✔ purrr     1.0.2
    +## ✔ dplyr     1.1.4     ✔ readr     2.1.5
     ## ✔ forcats   1.0.0     ✔ stringr   1.5.1
     ## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
     ## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
    +## ✔ purrr     1.0.2     
     ## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
     ## ✖ dplyr::filter() masks stats::filter()
     ## ✖ dplyr::lag()    masks stats::lag()
     ## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
    -
    library(dplyr)
    -library(lubridate)
    -library(jhur)

    Create some data to work with by running the following code chunk.

    set.seed(1234)