The class of the data tells R how to process the data. For example, it determines whether you can make summary statistics (numbers) or if you can sort alphabetically (characters).
+diff --git a/help.html b/help.html index d5d8f9796..0677d532b 100644 --- a/help.html +++ b/help.html @@ -347,14 +347,14 @@
Here we are creating a new object from an existing one:
new_rivers <- sample(rivers, 5)
new_rivers
-## [1] 340 330 720 1038 524
+## [1] 360 392 352 625 460
Using just this will only print the result and not actually change new_rivers:
new_rivers + 1
-## [1] 341 331 721 1039 525
+## [1] 361 393 353 626 461
If we want to modify new_rivers and save that modified version, then we need to reassign new_rivers like so:
new_rivers <- new_rivers + 1
new_rivers
-## [1] 341 331 721 1039 525
+## [1] 361 393 353 626 461
If we forget to reassign, subsequent steps may not work as expected because we will not be working with the modified data.
Make sure you run something like this, with the <- operator:
rivers2 <- new_rivers + 1
rivers2
-## [1] 342 332 722 1040 526
+## [1] 362 394 354 627 462
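The difference between printing and reassigning can be sketched with a fixed vector. This is a minimal illustration, not part of the help page: `lengths_mi` is a hypothetical name, and its values are copied from the sample output above so the result does not depend on the random draw.

```r
# Hypothetical stand-in for new_rivers (the original is a random sample)
lengths_mi <- c(360, 392, 352, 625, 460)

# Printing an expression does not modify the object
lengths_mi + 1
lengths_mi            # still the original values

# Reassigning with <- saves the modified version
lengths_mi <- lengths_mi + 1
lengths_mi
```

Only the last assignment changes what later steps will see.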
## [1] "character"+
The class of the data tells R how to process the data. For example, it determines whether you can make summary statistics (numbers) or if you can sort alphabetically (characters).
+There is one useful function associated with practically all R classes:
@@ -275,9 +279,9 @@## [1] TRUE FALSE NA-
as.Date(c("2021-06-15", "2021-06-32"))+
## [1] "2021-06-15" NA+
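The same coercion behavior can be sketched for numbers stored as text. This is an illustrative example under assumptions: `ages_chr` and `ages_num` are hypothetical names that do not appear in the slides.

```r
# Hypothetical character vector: numbers stored as text, plus one unparsable entry
ages_chr <- c("34", "27", "forty")
class(ages_chr)

# as.numeric() converts what it can; entries it cannot parse become NA
# (R also emits a warning, suppressed here to keep the output short)
ages_num <- suppressWarnings(as.numeric(ages_chr))
ages_num
class(ages_num)

# Summary statistics work once the missing value is excluded
mean(ages_num, na.rm = TRUE)
```

As with `as.Date()` above, a value that does not fit the target class is returned as `NA` rather than stopping the conversion.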
What is one reason we might want to convert data to numeric?
A. So we can take the mean
B. So the data looks better
C. So our data is correct
Two-dimensional classes are those we would often use to store data read from a file
-a data frame (data.frame or tibble class)
-a matrix (matrix class)
-also composed of rows and columns
-unlike data.frame or tibble, the entire matrix is composed of one R class
-for example: all entries are numeric, or all entries are character
lists
.list()
-mylist <- list(c("A", "b", "c"), c(1, 2, 3))
-mylist
-
-## [[1]]
-## [1] "A" "b" "c"
-##
-## [[2]]
-## [1] 1 2 3
-
-class(mylist)
-
-## [1] "list"
-
Note for function ymd: year month day
-a <- ymd("2021-06-15")
-b <- ymd("2021-06-18")
-a - b
-
-## Time difference of -3 days
-
-
Date class object
date() is picky…
date("06/15/2021") # This doesn't work, needs to be year month day+
## Error in as.POSIXlt.character(x, tz = tz(x)): character string is not in a standard unambiguous format+
mdy("06/15/2021")-
mdy
## [1] "2021-06-15"-
mdy("06/15/2021") # This works+
dmy("15-June-2021")
## [1] "2021-06-15"-
mdy("06/15/21") # This works+
ymd("2021-06-15")
## [1] "2021-06-15"-
Note for function mdy: month day year
Must match the data format!
@@ -670,7 +615,15 @@## [1] "2021-06-15"-
POSIXct class object
+a <- ymd("2021-06-15")
+b <- ymd("2021-06-18")
+a - b
+
+## Time difference of -3 days
+
+
class("2013-01-24 19:39:07")@@ -680,7 +633,7 @@
## [1] "2013-01-24 19:39:07 UTC"-
class(ymd_hms("2013-01-24 19:39:07")) # lubridate package+
ymd_hms("2013-01-24 19:39:07") %>% class()
## [1] "POSIXct" "POSIXt"@@ -688,9 +641,7 @@
Note for function ymd_hms: year month day hour minute second.
There are functions in case your data have only date, hour and minute (ymd_hm()) or only date and hour (ymd_h()).
Note dates are always displayed year month day, even if made with mdy!
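The display rule can be checked with a short sketch. Note the swap: this uses base R `as.Date()` with an explicit `format` instead of `mdy()`, only so the example runs without lubridate; `d_mdy` and `d_ymd` are hypothetical names.

```r
# Base R parsing with an explicit format (stand-in for lubridate's mdy())
d_mdy <- as.Date("06/15/2021", format = "%m/%d/%Y")
d_ymd <- as.Date("2021-06-15")

# Both print in year-month-day order, whatever the input order was
d_mdy
d_ymd
d_mdy == d_ymd

# format() is needed to display a different order; the result is a character
format(d_mdy, "%m/%d/%Y")
```

The input order only matters for parsing. Once the value is a `Date`, it is stored the same way and printed year month day.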
Two-dimensional classes are those we would often use to store data read from a file
* a data frame (data.frame or tibble class)
* a matrix (matrix class)
  * also composed of rows and columns
  * unlike data.frame or tibble, the entire matrix is composed of one R class
  * for example: all entries are numeric, or all entries are character
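The contrast between a matrix and a data frame can be sketched in base R. The object names (`mixed`, `df`) and the values are hypothetical, chosen only for illustration.

```r
# Combining numbers and text into a matrix coerces everything to one class
mixed <- cbind(c(1, 2), c("a", "b"))
class(mixed)      # "matrix" "array" in R >= 4.0
typeof(mixed)     # every entry is stored as character

# A data frame keeps a separate class for each column
df <- data.frame(id = c(1, 2), label = c("a", "b"), stringsAsFactors = FALSE)
sapply(df, class)
```

`stringsAsFactors = FALSE` is set explicitly so the `label` column stays character on older R versions as well.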
lists
.Can be created using list()
+mylist <- list(c("A", "b", "c"), c(1, 2, 3))
+mylist
+
+## [[1]]
+## [1] "A" "b" "c"
+##
+## [[2]]
+## [1] 1 2 3
+
+class(mylist)
+
+## [1] "list"
+
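As a follow-up sketch using the same `mylist`, each element of a list keeps its own class and can be pulled out with `[[ ]]`. The name `element_classes` is hypothetical.

```r
mylist <- list(c("A", "b", "c"), c(1, 2, 3))

# Each element keeps its own class inside the list
element_classes <- sapply(mylist, class)
element_classes

# [[ ]] extracts a single element, which can then be used as a normal vector
mylist[[2]]
mean(mylist[[2]])
```

This is what distinguishes a list from a matrix: the elements do not have to share one class.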
as.numeric() or as.character()
Date class using ymd(), mdy() functions from lubridate package
POSIXct class representing a calendar date with hours, minutes, seconds. Can use ymd_hms() or ymd_hm() or ymd_h() functions from the lubridate package
Date or POSIXct class variables or pull out aspects like year
Date class using ymd(), mdy() functions from lubridate package
💻 Lab
+See the extra slides for more advanced topics.
+Image by Gerd Altmann from Pixabay
diff --git a/modules/Data_Classes/lab/Data_Classes_Lab.Rmd b/modules/Data_Classes/lab/Data_Classes_Lab.Rmd
index 50cb89a3b..3437a419f 100644
--- a/modules/Data_Classes/lab/Data_Classes_Lab.Rmd
+++ b/modules/Data_Classes/lab/Data_Classes_Lab.Rmd
@@ -12,11 +12,7 @@ editor_options:
 Load all the packages we will use in this lab.
 ```{r}
-library(readr)
 library(tidyverse)
-library(dplyr)
-library(lubridate)
-library(jhur)
 ```
 Create some data to work with by running the following code chunk.
diff --git a/modules/Data_Classes/lab/Data_Classes_Lab_Key.html b/modules/Data_Classes/lab/Data_Classes_Lab_Key.html
index 1acacfbc4..7608209c7 100644
--- a/modules/Data_Classes/lab/Data_Classes_Lab_Key.html
+++ b/modules/Data_Classes/lab/Data_Classes_Lab_Key.html
@@ -170,20 +170,17 @@
Load all the packages we will use in this lab.
-library(readr)
-library(tidyverse)
+library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
-## ✔ dplyr 1.1.4 ✔ purrr 1.0.2
+## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
+## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
-library(dplyr)
-library(lubridate)
-library(jhur)
Create some data to work with by running the following code chunk.
set.seed(1234)