diff --git a/docs/data-wrangling1.html b/docs/data-wrangling1.html new file mode 100644 index 0000000..177ff33 --- /dev/null +++ b/docs/data-wrangling1.html @@ -0,0 +1,2813 @@ + + + + + + + + + + + Data wrangling + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+
+ + + + +

Data wrangling

+ + + + +

Ben Whalley

+

October 2023

+ + + + +
+

+
+

Most time in data analysis is spent ‘tidying up’ data: getting it +into a suitable format to get started. Data scientists have a particular +definition of tidy: Tidy datasets are “easy to manipulate, +model and visualize, and have a specific structure: each variable is a +column, each observation is a row” (Wickham 2014).

+
+
+

It’s often not convenient for humans to enter data in a tidy +way, so untidy data is probably more common than tidy data in the wild. +But doing good, reproducible science demands that we document each step +of our processing in a way that others can check or repeat in future. +Tools like R make this easier.

+
+
+

Overview

+

In the lifesavR +worksheets we used various commands from the tidyverse, +like filter and group_by.

+

If you want to recap these commands you could use the +cheatsheet, especially the part on groups and summaries.

+
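As a quick refresher of that pattern, here is a minimal sketch using the built-in iris dataset (any dataset would do):

```r
# assumes the tidyverse is loaded, as in the lifesavR worksheets
library(tidyverse)

iris %>%
  filter(Sepal.Length > 5) %>%                  # keep only a subset of rows
  group_by(Species) %>%                         # then summarise per species
  summarise(mean_petal = mean(Petal.Length))
```

This is the same filter/group_by/summarise pipeline we will build on below.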
+

Today we will cover three additional techniques which are important +when working with real datasets:

+
    +
  1. ‘Pivoting’ or reshaping data from long to wide formats (or the reverse)
  2. Separating ‘untidy’ variables into tidy, long-form data
  3. Adding meaningful labels to categorical variables
+
+
+

Before you start

+
+

Make a new rmd file in your datafluency directory, called data-wrangling.rmd, and record your work in it for the next two sessions.

+
+
+
+

Selecting columns

+

The fuel data also contains variables for weight and +power.

+

We can select just these columns and save them to a smaller dataframe +like this:

+
carperformance <- fuel %>% 
+  select(mpg, weight, power)
+
+

Explanation of the commands

+
    +
  • On the far left we have the name of the new variable which we will create: carperformance.
  • We can tell this will be a new variable because the <- symbol is just to the right, pointing at it.
  • To see what carperformance contains, look to the right of the <-. We pipe the fuel data to the select command, which selects the mpg, weight, and power columns.
+
+
+

Explanation of the result

+

When running the command you won’t see any output — but a new object called carperformance was created, containing copies of the columns we selected from fuel.

+

We can see the first few rows of our new smaller dataframe like +this:

+
carperformance %>% head() 
+
   mpg weight power
+1 21.0   1188   110
+2 21.0   1304   110
+3 22.8   1052    93
+4 21.4   1458   110
+5 18.7   1560   175
+6 18.1   1569   105
+
+

Try selecting columns in a dataset for yourself:

+
    +
  • Use any of the built in datasets, creating a copy with just a subset +of 3 of its columns.
  • +
+
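One possible answer, using the built-in iris data (any built-in dataset and any three of its columns would work just as well):

```r
library(tidyverse)

# copy just three of iris's five columns into a new, smaller dataframe
iris_small <- iris %>%
  select(Sepal.Length, Sepal.Width, Species)

iris_small %>% head()
```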
+
+
+
+

Pivoting longer

+

Data is commonly stored in either wide +or long format.

+

If you used SPSS to do a t-test or ANOVA during your +undergraduate degree, you likely stored and analysed the data in +wide format.

+

In wide format, each row represents the observations from a single participant. Each measurement for a given participant is stored in a separate column.

+

This is often called row per subject data. An +example is the built in attitude dataset:

+
attitude %>%
+  head()
+
  rating complaints privileges learning raises critical advance
+1     43         51         30       39     61       92      45
+2     63         64         51       54     63       73      47
+3     71         70         68       69     76       86      48
+4     61         63         45       47     54       84      35
+5     81         78         56       66     71       83      47
+6     43         55         49       44     54       49      34
+

Explanation: Each row contains scores for a +particular employee on various measures. To find out more about these +data you can type ?attitude into the console.

+
+

Let’s say we want a single plot of all these variables, something +like this:

+

+

To do this we first need to convert the data to long format. +In long format, each observation is saved in its own +row, rather than across multiple columns.

+

It’s often called “row per observation” data.

+
+

Using pivot_longer()

+
+

Pivoting is where you take a long data file (lots of rows, few +columns) and make it wider. Or where you take a wide data file (lots of +columns, few rows) and make it longer.

+
+

+

We can convert from wide to long using the +pivot_longer() function, as shown in the video:

+

To see why the function is called ‘pivot_longer’, +imagine trying to reshape just the first two rows of the attitude +dataset:

+
  rating complaints privileges learning raises critical advance
+1     43         51         30       39     61       92      45
+2     63         64         51       54     63       73      47
+

If we use pivot_longer on this selection, we end up with +this:

+
attitude %>%
+  head(2) %>% 
+  pivot_longer(everything()) 
+
# A tibble: 14 × 2
+   name       value
+   <chr>      <dbl>
+ 1 rating        43
+ 2 complaints    51
+ 3 privileges    30
+ 4 learning      39
+ 5 raises        61
+ 6 critical      92
+ 7 advance       45
+ 8 rating        63
+ 9 complaints    64
+10 privileges    51
+11 learning      54
+12 raises        63
+13 critical      73
+14 advance       47
+

Explanation of the command: We selected a subset of columns and rows. Then we used pivot_longer(everything()) to make this into long form data. The everything() part tells R to merge values from all of the columns into a single new column called value, and to keep track of the original variable name in a new column called name.

+

The change works like this:

+
+Converting from wide format to long format +
Converting from wide format to long format
+
+
+

You might have spotted a problem though: We don’t have a record of +which participant was which in the attitude dataset.

+

This is because the mapping to participants was implicit: +each row was a different participant, but participant number was +not actually recorded in the file.

+

We can create an explicit participant identifier by adding a new +column. For this we use the mutate and +row_number() functions:

+
attitude_with_person <- attitude %>%
+  mutate(person = row_number()) %>%
+  head(2) 
+
+attitude_with_person %>% 
+  pander()
 rating complaints privileges learning raises critical advance person
     43         51         30       39     61       92      45      1
     63         64         51       54     63       73      47      2
+

Now we have a column called person which stores the row +number.

+

But this means if we pivot_longer() +again, we will need to tell R which columns we would like to +pivot.

+

If we don’t do this then the person column gets melted +with everything else so we lose track of which response belonged to +which participant, like this:

+
attitude_with_person %>%
+  pivot_longer(everything()) %>% 
+  pander()
 name        value
 rating         43
 complaints     51
 privileges     30
 learning       39
 raises         61
 critical       92
 advance        45
 person          1
 rating         63
 complaints     64
 privileges     51
 learning       54
 raises         63
 critical       73
 advance        47
 person          2
+

Explanation of the output: Because we didn’t tell pivot_longer which columns we wanted to pivot, it put all the values into a single new column called value. This included our participant identifier, person, which is not what we wanted.

+
+

We can exclude person from the pivoting by writing:

+
attitude_with_person %>%
+  pivot_longer(-person) %>% 
+  head() %>% 
+  pander()
 person name        value
      1 rating         43
      1 complaints     51
      1 privileges     30
      1 learning       39
      1 raises         61
      1 critical       92
+

Explanation of the command and output:

+
    +
  • Here, we still use pivot_longer but this time we put +-person between the parentheses.
  • +
  • The minus sign, -, means don’t include this +variable, so -person ends up meaning include all +columns except person, which is what we wanted.
  • +
  • The output now retains the person column, but pivots +the other variables.
  • +
  • This means we can tell which person provided each datapoint.
  • +
+
+

Use some tidyverse commands you already know +(e.g. select), plus pivot_longer, to produce +this plot using the attitude dataset:

+

+
+ +
    +
  • Check the cheatsheet +if you get stuck
  • +
  • You need to select only the three variables shown
  • +
  • It’s not necessary to create a person identifier for this plot +(although it won’t hurt if you do)
  • +
+
+
+ +
attitude %>%
+  select(rating, complaints, learning) %>%
+  pivot_longer(everything()) %>% 
+  ggplot(aes(name, value)) +
+  geom_boxplot()
+

+
+
+
+
+
+

Pivoting to make summary tables

+

Imagine we want a table of the mean score for each question in the +attitude dataset.

+

This would be fiddly if we just tried to use summarise +on wide format data. But if we use pivot_longer, +group_by and then summarise (in that order) +it’s possible to take the data and make a table like this with 3 +instructions to R:

 Name        Mean     SD
 advance     42.93  10.29
 complaints  66.6   13.31
 critical    74.77   9.895
 learning    56.37  11.74
 privileges  53.13  12.24
 raises      64.63  10.4
 rating      64.63  12.17
+
+

Combine the pivot_longer, group_by and +summarise commands (in that order) to reproduce the table +above.

+
+ +
    +
  • You want to pivot all of the variables in the +attitude dataset this time
  • +
  • We covered using summarise in the third +lifesavR session here.
  • +
+
+
+
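If you get stuck, a solution along these lines reproduces the table above (the column labels Mean and SD are set inside summarise; other names would work too):

```r
library(tidyverse)
library(pander)

attitude %>%
  pivot_longer(everything()) %>%                 # one row per observation
  group_by(name) %>%                             # group by questionnaire item
  summarise(Mean = mean(value), SD = sd(value)) %>%
  pander()
```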
+

We might also want to produce summary statistics per-participant.

+

Using the commands shown above (and remembering to make a new column +to store the participant number with row_number()), +reproduce this table:

 person mean(value)
      1       51.57
      2       59.29
      3       69.71
      4       55.57
      5       68.86
+
+ +

To make the table you will need to use the following functions, in +roughly this order:

+
    +
  • mutate
  • +
  • filter
  • +
  • pivot_longer
  • +
  • group_by
  • +
  • summarise
  • +
  • pander
  • +
+
+
+ +
attitude %>% 
+  mutate(person = row_number()) %>% 
+  filter(person < 6) %>% 
+  pivot_longer(-person) %>% 
+  group_by(person) %>% 
+  summarise(mean(value)) %>% 
+  pander()
+
+
+
+
+

Pivoting wider

+

Sometimes we have the opposite problem: We have long data, but want +it in wide format. For example, we might want a table where it’s easy to +compare between different years, like this:

+
development %>%
+  filter(year > 1990) %>% 
+  pivot_wider(id_cols=country, 
+              names_from=year, 
+              values_from=gdp_per_capita) %>% 
+  head(3) %>% 
+  pander::pander("GDP per-capita in 3 countries in 4 different years, from the development dataset.")
GDP per-capita in 3 countries in 4 different years, from the development dataset.

 country       1992  1997  2002  2007
 Afghanistan  649.3 635.3 726.7 974.6
 Albania       2497  3193  4604  5937
 Algeria       5023  4797  5288  6223
+
+

Instead of making the data longer, now we want to +pivot_wider.

+

The development data is in a fairly long format: there are multiple rows per country, corresponding to different years.

+

We want to compare GDP in different years.

+

We first need to select the data we want — country, +year and GDP, for the years after 1990:

+
development1990s <- development %>%
+  select(country, year, gdp_per_capita) %>%
+  filter(year >= 1990)
+

Then we can pivot_wider():

+
development1990s %>%
+  pivot_wider(
+    names_from = year, 
+    values_from = gdp_per_capita
+  ) %>% 
+  head() %>% 
+  pander()
 country       1992  1997  2002  2007
 Afghanistan  649.3 635.3 726.7 974.6
 Albania       2497  3193  4604  5937
 Algeria       5023  4797  5288  6223
 Angola        2628  2277  2773  4797
 Argentina     9308 10967  8798 12779
 Australia    23425 26998 30688 34435
+

Explanation of the command and output:

+
    +
  • We started with multiple rows per country, corresponding to +years.
  • +
  • We used pivot_wider with names_from = year +to create new columns for each year in the data.
  • +
  • We used values_from=gdp_per_capita to tell pivot_wider to use the GDP numbers to populate the table.
  • +
  • The resulting table helps us compare years within countries, or +between countries, for a given year.
  • +
+
+

Use the funimagery dataset in psydata and +perform the following:

+
    +
  • use select to make a dataset with +intervention and each of the kg1 to +kg3 columns
  • +
  • Use pivot_longer, group_by and +summarise to calculate the average weight of participants +at each timepoint
  • +
  • Adapt the group_by function to calculate the mean at +each timepoint for each group separately
  • +
  • Add pivot_wider to the end of your code to create a +separate column for each group.
  • +
+

When you finish your data should look like this:

 name     MI    FIT
 kg1   89.86  91.46
 kg2   88.62  86.37
 kg3   88.46  84.04
+
+ +
funimagery %>% 
+  select(intervention, kg1, kg2, kg3) %>% 
+  pivot_longer(-intervention) %>% 
+  group_by(name, intervention) %>% 
+  summarise(M=mean(value)) %>% 
+  pivot_wider(names_from=intervention, values_from=M) %>% 
+  pander()
+
+
+
+
+

Separating variables

+

Sometimes we need to separate ‘untidy’ variables into tidy, long-form +data.

+

+

The code below generates simulated data for 100 individuals at three +time points. The format is similar to the way you might record +experimental data in a spreadsheet.

+
set.seed(1234)
+N <- 100
+repeatmeasuresdata <- tibble(person = 1:N,
+                              time_1 = rnorm(N),
+                              time_2 = rnorm(N, 1),
+                              time_3 = rnorm(N, 3))
+
+repeatmeasuresdata %>% head(8) %>% 
+  pander()
 person  time_1  time_2 time_3
      1 -1.207   1.415   3.485
      2  0.2774  0.5253  3.697
      3  1.084   1.066   3.186
      4 -2.346   0.4975  3.701
      5  0.4291  0.174   3.312
      6  0.5061  1.167   3.76
      7 -0.5747  0.1037  4.842
      8 -0.5466  1.168   4.112
+

This variable, repeatmeasuresdata, is in +wide format. Each row contains data for one +participant, and each participant has three observations.

+

As we saw previously, +we can pivot — i.e., reshape — the data into longer format like +so:

+
repeatmeasuresdata %>%
+  pivot_longer(starts_with("time")) %>%
+  arrange(person, name) %>%
+  head(8) %>% 
+  pander()
 person name    value
      1 time_1 -1.207
      1 time_2  1.415
      1 time_3  3.485
      2 time_1  0.2774
      2 time_2  0.5253
      2 time_3  3.697
      3 time_1  1.084
      3 time_2  1.066
+

The problem we have now is that name contains text which +describes at which time the observation was made. We probably want to +store a number for each time-point, so we can make a plot with +time on the x axis.

+

The separate command splits a single character column (name) into multiple columns. Rather than one column with labels of the form ‘time_1’, it can create two columns, containing ‘time’ and ‘1’ respectively.

+
# convert to long form; extract the `time` as a new numeric column
+longrepeatmeasuresdata <- repeatmeasuresdata %>%
+  pivot_longer(starts_with("time")) %>%
+  separate(name, into = c("variable", "time"))
+
+longrepeatmeasuresdata %>% head %>% 
+  pander()
 person variable time  value
      1 time     1    -1.207
      1 time     2     1.415
      1 time     3     3.485
      2 time     1     0.2774
      2 time     2     0.5253
      2 time     3     3.697
+

Now the data are in long format, we can plot the points over +time:

+
longrepeatmeasuresdata %>%
+  sample_n(30) %>%
+  ggplot(aes(x=time, y=value)) +
+  geom_point()
+

+
+

How does R know where to split the text?

+

In the example above, separate split data like +"time_1", "time_2" etc into two columns: +variable and time.

+

Q: How did it know to use the underscore (_) to split +the data?

+

A: The default is to split on anything which is not a letter or +number. So _ or a space, or , would all +work.

+

Sometimes, though, we need to tell R explicitly what to use to separate the values.

+

If we had a column of email addresses we could split +ben.whalley@plymouth.ac.uk into the username +(e.g. ben.whalley) and domain name +(plymouth.ac.uk) using the @ symbol.

+

To do this we just write sep="@" when we use +separate.

+
+

The messy_exp dataset in psydata contains +simulated RT data on 100 participants in 2 conditions (A and B) at three +time points (1, 2, and 3).

+
    +
  • Use the separate() function to split up the +condition variable in this dataset and draw the following +plot:
  • +
+

+
+ +
messy_exp %>% 
+  separate(condition, into=c("participant", "condition", "time")) %>% 
+  ggplot(aes(time, rt, color=condition)) + 
+  geom_boxplot(width=.5) + 
+  labs(x="Time", y="Reaction time (ms)", color="Condition")
+
+
+
+
    +
  1. This file contains sample contact and address data for 100 people: +https://letterhub.com/wp-content/uploads/2018/03/100-contacts.csv
  2. +
+
    +
  • Read the data into R (you can either use the URL above directly +inside the read_csv() function, or download then re-upload +the data to the server to do this)

  • +
  • Use the separate function to make a new variable +which contains the domain name of these contacts’ email address +(e.g. yahoo.com, hotmail.com)

  • +
+
+

Note, you will need to use sep="@" to split the email +addresses at the @ symbol

+
+
    +
  1. Use the distinct and/or count functions on +the new variable you create containing the domain name. Look them up in +the help file if you don’t know which to use to answer these +questions:
  2. +
+
    +
  • How many people had a Gmail account?
  • +
  • Which domains had more than 10 users?
  • +
+
+ +
# read the data directly from the URL
+contacts <- read_csv('https://letterhub.com/wp-content/uploads/2018/03/100-contacts.csv') %>% 
+  separate(email, into=c("user", "domain"), sep ="@")  # uses the @ symbol as a separator
+
# how many _different_ domains are there?
+contacts %>% 
+  distinct(domain) %>% 
+  count() %>% 
+  pander()
  n
 39
+
# how many people use gmail
+contacts %>% 
+  count(domain) %>% 
+  filter(domain=="gmail.com") %>% 
+  pander()
 domain      n
 gmail.com  16
+
# which domains had more than 10 users?
+contacts %>% 
+  count(domain) %>% 
+  filter(n > 10) %>% 
+  pander()
 domain        n
 aol.com      13
 gmail.com    16
 hotmail.com  13
 yahoo.com    13
+
+
+
+
+
+

Questionnaire data

+

The file sweets.csv contains a small number of example +rows of data exported from an online survey.

+

The file is at: https://t.ly/H9sDJ

+ +

We can get an overview of the columns in the data using the glimpse command:

+
sweets <- read_csv('https://t.ly/H9sDJ')
+sweets %>% glimpse()
+
Rows: 4
+Columns: 8
+$ ID                               <dbl> 6, 7, 8, 9
+$ `Start time`                     <dttm> 2019-05-24 09:12:31, 2019-05-24 09:12…
+$ `Completion time`                <dttm> 2019-05-24 09:12:35, 2019-05-24 09:1…
+$ Email                            <chr> "anonymous", "anonymous", "anonymous…
+$ Name                             <lgl> NA, NA, NA, NA
+$ `How much do you like sweets?`   <chr> "I don't like them", "I'm neutral", "…
+$ `How much do you like chocolate` <chr> "I don't like them", "I don't like th…
+$ Gender                           <chr> "M", "F", "M", "F"
+
+
    +
  • Import the sweets data as shown above from: https://t.ly/H9sDJ

  • +
  • Save it to a new variable called sweets

  • +
+
+
+

Tidying questionnaires

+

When we look at the imported data it’s useful to note:

+
    +
  1. There are extra columns we don’t need (at least for +now).

  2. +
  3. Some of our variable names are very long and annoying to type +(for example How much do you like sweets? is the name of +one of our columns).

  4. +
  5. Our responses are in text format, rather than as +numbers. For example, the data say "I don't like them" or +"I'm neutral" rather than numbers from a 1-5 +scale.

  6. +
+

We need to sort each of these problems to make things more manageable +for our analysis.

+
+
+

Selecting and renaming

+
+

Remember, R makes it hard to use columns whose names contain spaces or other special characters. We want to avoid this.

+
+
+

Selecting

+

To use columns with spaces in their names we must ‘escape’ the spaces and let R know they are part of the name rather than a gap between two different names.

+

This video shows how (or read below):

+

+

To escape spaces and use columns with long names we use the backtick +character (the backwards facing apostrophe) to wrap the column +name.

+

In general, if your columns contain spaces or other odd +characters like hyphens or question marks then you will need to wrap +them in backticks.

+
+
+

Renaming

+

Some of the imported variable names in the sweets data +are long and awkward to use.

+

Most researchers would rename these variables, to make them more +usable in R code.

+

You can rename variables like this:

+
datasetname %>% 
+  rename(NEW_COLUMN_NAME = OLD_COLUMN_NAME)
+

So for this example:

+
sweets %>%
+  rename(
+    like_sweets = `How much do you like sweets?`,
+    like_chocolate = `How much do you like chocolate`,
+  )
+
# A tibble: 4 × 8
+     ID `Start time`        `Completion time`   Email     Name  like_sweets     
+  <dbl> <dttm>              <dttm>              <chr>     <lgl> <chr>           
+1     6 2019-05-24 09:12:31 2019-05-24 09:12:35 anonymous NA    I don't like th…
+2     7 2019-05-24 09:12:31 2019-05-24 09:12:35 anonymous NA    I'm neutral     
+3     8 2019-05-24 09:12:31 2019-05-24 09:12:35 anonymous NA    I like them     
+4     9 2019-05-24 09:12:31 2019-05-24 09:12:35 anonymous NA    I'm neutral     
+# ℹ 2 more variables: like_chocolate <chr>, Gender <chr>
+

Explanation of the code: We used rename +to change the names of our variables. We needed to wrap the long names +of the questions in ‘backtick’ symbols to make sure R understood it was +a single column name.

+

You should create a new variable to save the renamed dataset (with a +descriptive name for use later on):

+
# create a new variable containing the renamed dataset
+sweets.renamed <- sweets %>%
+  rename(
+    like_sweets = `How much do you like sweets?`,
+)
+
+
    +
  1. Create a copy of the sweets data in which you have selected only +the two columns with long names.

  2. +
  3. Create a second copy of the data where you have renamed the +columns with long names to something short, and without spaces.

  4. +
+
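One possible answer (the short names are my choice; anything brief and space-free works). This assumes sweets has been imported as shown above:

```r
# 1. select only the two long-named columns
sweets.questions <- sweets %>%
  select(`How much do you like sweets?`, `How much do you like chocolate`)

# 2. rename the long columns to short, space-free names
sweets.short <- sweets %>%
  rename(
    like_sweets    = `How much do you like sweets?`,
    like_chocolate = `How much do you like chocolate`
  )
```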
+
+
+

Renaming with the janitor package

+

A good alternative to renaming variables manually is to use the +clean_names function in the janitor +package.

+
sweets %>% 
+  janitor::clean_names() %>% 
+  glimpse
+
Rows: 4
+Columns: 8
+$ id                             <dbl> 6, 7, 8, 9
+$ start_time                     <dttm> 2019-05-24 09:12:31, 2019-05-24 09:12:3…
+$ completion_time                <dttm> 2019-05-24 09:12:35, 2019-05-24 09:12:…
+$ email                          <chr> "anonymous", "anonymous", "anonymous",…
+$ name                           <lgl> NA, NA, NA, NA
+$ how_much_do_you_like_sweets    <chr> "I don't like them", "I'm neutral", "I …
+$ how_much_do_you_like_chocolate <chr> "I don't like them", "I don't like them…
+$ gender                         <chr> "M", "F", "M", "F"
+

Explanation of the code and result. I used the clean_names function from the janitor package without using library, by typing janitor:: and then the name of the function. In the result, clean_names has:

+
    +
  • Removed all special characters
  • +
  • Made everything lower case (easier for R to autocomplete)
  • +
  • Replaced spaces with underscores
  • +
  • Made column names unique (this isn’t always the case with imported +data, but is important for R)
  • +
+

I typically use this function when importing any new data because it +makes the naming and access of columns much more consistent and easier +to remember.

+
+
+
+
+

Recoding text

+

We noticed above that our responses were stored as text labels like +"I don't like them" rather than on a numeric scale. This +makes it hard to use in an analysis.

+

We need to recode the text variables into numeric +versions.

+
+

How to do it

+

First we must tell R what number we want to use for each text label. +That is, we create a mapping of numbers to +labels.

+

This takes a few steps:

+
    +
  1. Check exactly what the text values are which need +to be mapped.
  2. +
  3. Make a mapping variable which assigns each text +value a number value
  4. +
  5. Use the recode function with mutate to +create a new, numeric column
  6. +
+

This video walks you through the steps below:

+

+
+
+

Step 1: Check EXACTLY what text labels we have

+

To check which labels we need to recode, I select the column in +question and use the unique() function.

+
# check exactly what text values are in the dataset?
+sweets %>% 
+  select(`How much do you like sweets?`) %>% 
+  unique() 
+
# A tibble: 3 × 1
+  `How much do you like sweets?`
+  <chr>                         
+1 I don't like them             
+2 I'm neutral                   
+3 I like them                   
+
+

Do the same to find out the possible values of the +How much do you like chocolate column.

+
+ +
sweets %>% 
+  select(`How much do you like chocolate`) %>% 
+  distinct() %>% 
+  pander()
 How much do you like chocolate
 I don’t like them
 I’m neutral
+
+
+
+
+

Step 2: Make a mapping variable

+

We do this by creating what R calls a named vector, which is a special kind of list.

+

To make a named vector we use the c() function. The letter c here just stands for ‘combine’ — i.e. ‘combine these things into a list’.

+

This is a simple example:

+
mapping.list <- c("No" = 0, "Yes" = 1)
+

We could then use this mapping to recode a column of data which contained the words “No” or “Yes”.
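For example, with a made-up column of yes/no answers (the !!! syntax used here is explained later in this worksheet):

```r
library(tidyverse)

mapping.list <- c("No" = 0, "Yes" = 1)

# a made-up column of Yes/No responses
answers <- tibble(smoker = c("Yes", "No", "Yes"))

answers %>%
  mutate(smoker_numeric = recode(smoker, !!!mapping.list))
```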

+


+

A useful trick when creating your own mappings is to use R +to do the formatting for you (see the video above for a demo).

+

Re-using the code from the previous step, we use +unique() to show us the unique values for the +question about sweets.

+

We then pipe the result to the paste() and +cat() functions, like this:

+
# the hack we use as a short-cut to creating a mapping variable
+sweets %>% 
+  select(`How much do you like sweets?`) %>% 
+  unique() %>% 
+  paste() %>% cat()
+
c("I don't like them", "I'm neutral", "I like them")
+

Explanation of the output: Using paste +and cat is a bit of a hack. When we run this code we see +the output +c("I don't like them", "I'm neutral", "I like them"). This +is a list of the values in the sweets data for this +question, formatted in a way that will be useful to us in the next +step.

+

We then copy and paste this output into a NEW code block, and EDIT it +to assign our mappings:

+
preference.mappings <- c("I don't like them" = -1, "I'm neutral" = 0, "I like them" = 1)
+

Explanation of the code: We used the previous output +to create a mapping. By adding the parts which read = -1 +and = 0 etc, we have told R what value we want to assign +for each label.

+
+

Q: How do you know what number values to assign?

+

A: It doesn’t matter, provided:

+
    +
  • The intervals between the options are the same, and
  • +
  • Each text value has a different number
  • +
+

So, if we had a Likert-scale ranging from “Completely agree” to “Completely disagree” in 7 increments, we could score this from 0 to 6, 1 to 7, or -3 to 3. These would all be fine.

+
+
+
+

Step 3: Use the mapping variable to recode the column

+

We can use our new mapping with the mutate and +recode functions to make a new column, +containing numbers rather than text:

+
sweets.recoded <-  sweets %>% 
+    rename(
+      like_sweets = `How much do you like sweets?`,
+      like_chocolate = `How much do you like chocolate`,
+    ) %>% 
+    # use recode to convert text response using preference.mappings
+    mutate(
+        like_sweets_numeric =
+            recode(like_sweets, !!!preference.mappings)
+    )
+

We can see this new column if we use glimpse:

+
sweets.recoded %>% glimpse()
+
Rows: 4
+Columns: 9
+$ ID                  <dbl> 6, 7, 8, 9
+$ `Start time`        <dttm> 2019-05-24 09:12:31, 2019-05-24 09:12:31, 2019-05-…
+$ `Completion time`   <dttm> 2019-05-24 09:12:35, 2019-05-24 09:12:35, 2019-05…
+$ Email               <chr> "anonymous", "anonymous", "anonymous", "anonymous"
+$ Name                <lgl> NA, NA, NA, NA
+$ like_sweets         <chr> "I don't like them", "I'm neutral", "I like them"…
+$ like_chocolate      <chr> "I don't like them", "I don't like them", "I'm neu…
+$ Gender              <chr> "M", "F", "M", "F"
+$ like_sweets_numeric <dbl> -1, 0, 1, 0
+

Explanation of the code:

+
    +
  • The start of the first line is sweets.recoded <- +which means make a new variable called sweets.recoded.
  • +
  • Then we use mutate to create a new +column called like_sweets_numeric.
  • +
  • We make this column using recode on the question about +liking sweets.
  • +
  • We use the preference.mappings mapping to specify what +numeric score to give each of the text values.
  • +
+

Watch out for the exclamation marks!!!: +In the code above there are three exclamation marks, !!!, +before the mapping; make sure you do the same.

+
+ +
+

Three exclamation marks: !!!

+

Understanding this isn’t necessary to get on with the course. +Only read this if you are interested!

+

In the code above when we used recode we used three +exclamation marks just before our list.

+

We defined the mapping:

+
likert.responses <- c(
+            "I hate them" = 1,
+            "I don't like them" = 2,
+            "I'm neutral" = 3,
+            "I like them" = 4,
+            "I can't live without them" = 5)
+

And then used it with recode, with the three exclamation +marks.

+
liking_of_sweets_data %>%
+    mutate(like_sweets_numeric = recode(like_sweets_text, !!!likert.responses)) %>% 
+  pander()
+

The reason for this is that recode actually expects us +to specify the mapping for it like this:

+
liking_of_sweets_data %>%
+    mutate(like_sweets_numeric = recode(like_sweets_text,
+        "I hate them" = 1,
+        "I don't like them" = 2 ...))
+

But this means we have to repeat the mapping for each of the +questions. Because all the questions use the same mapping this gets +repetitive, and can lead to errors.

+

The three exclamation marks !!! unpacks the +list for us. So writing !!!likert.responses saves us the +bother of writing it out in full each time.

+
+
+
+
+

Summary/video explanation

+

This is one of the trickiest bits of R code we use. I’ve included an +annotated video of just these steps as a reference below:

+

+
+
    +
  • Use this three-step process to create a recoded version of the +like_chocolate variable.

  • +
  • Remember to watch the video at the start of this section, or the +short version in the green box above, if anything is unclear.

  • +
+
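A possible solution for like_chocolate, reusing the preference.mappings vector defined above (the name like_chocolate_numeric is my choice; it matches the column used in the next section):

```r
# assumes sweets.recoded and preference.mappings exist, as created above
sweets.recoded <- sweets.recoded %>%
  mutate(
    like_chocolate_numeric = recode(like_chocolate, !!!preference.mappings)
  )

sweets.recoded %>% select(like_chocolate, like_chocolate_numeric)
```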
+
+
+

Combining scores

+

Often questionnaires are designed to make repeated measurements of +the same phenomena, which can then be summed or averaged to create a +more reliable measure.

+

We’ve already seen how mutate() creates a new column. We +can use this again to create the sum of responses to both +questions:

+
sweets.recoded %>%
+  # mutate to create column containing sum of both recoded questions
+  mutate(liking = like_sweets_numeric + like_chocolate_numeric) %>%
+  # select the new variable we created
+  select(ID, liking) %>% 
+  pander()
 ID   liking 
---- --------
  6     -2   
  7     -1   
  8      1   
  9      0   
+

Explanation of the code:

+
  • We added together the two questions which asked about ‘liking’. This created a new column containing the combined score, called liking.

  • In the third line we selected only the ID column, plus the new column we made.
+
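A sum is not the only way to combine items: dividing by the number of questions gives a mean score, which stays on the original response scale. Here is a sketch with invented values — the column names mirror sweets.recoded, but the tiny example tibble below is made up:

```r
library(dplyr)
library(tibble)

# invented mini-dataset with the same columns as sweets.recoded
example <- tibble(
  ID = c(6, 7, 8, 9),
  like_sweets_numeric    = c(-1, -1, 1,  1),
  like_chocolate_numeric = c(-1,  0, 0, -1))

example %>%
  # mean of the two items, still on the -1 to 1 scale
  mutate(liking_mean = (like_sweets_numeric + like_chocolate_numeric) / 2) %>%
  select(ID, liking_mean)
```

Whether you sum or average makes no difference to later analyses like correlation or regression, but a mean is often easier to interpret because it can be read against the original response options.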
+
+

Using ChatGPT to automate things

+

In the sections above I explained how to tidy up a simple dataset. However, it turns out many of these tasks are easily solved using ChatGPT.

+

You now have a choice you didn’t have a few years ago:

+
  1. Learn how to do these steps by hand (and sometimes use ChatGPT to automate the work)
  2. Learn just enough to be able to get ChatGPT to write code for you (and hope it’s correct)
+

I would recommend option 1 (at least until LLMs improve at code generation) because it will likely be faster and more accurate, and it will promote your understanding.

+

But as a guide, here’s how I might solve this problem with ChatGPT as +an assistant.

+
+

Step 1: Show ChatGPT the data

+

I use these commands to output some information about the data, in a format that’s easy to copy into ChatGPT.

+
sweets <- read_csv('https://t.ly/H9sDJ')
+
+# lists the column names
+sweets %>% names()
+
+sweets %>% glimpse()
+
+# list the unique values in response to this question
+sweets %>% count(`How much do you like sweets?`) %>% 
+  pander()
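Another way to hand ChatGPT a faithful sample of your data — a habit of mine, not shown in the commands above — is base R’s dput(), which prints a copy-pasteable representation of an object. The small tibble below is invented to stand in for the sweets data:

```r
library(tibble)

# a small invented tibble standing in for the imported sweets data
example <- tibble(
  ID = 1:2,
  like_sweets = c("I like them", "I'm neutral"))

# dput() prints R code that recreates the object exactly,
# which pastes cleanly into a ChatGPT prompt
dput(head(example))
```

Because dput output preserves column types and exact values (including stray spaces), it avoids the transcription errors that can creep in when describing data by hand.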
+
+
+

Step 2: Ask GPT to recode it for us

+

This link shows how I asked ChatGPT to recode the sweets +data:

+

https://chat.openai.com/share/325b1f77-c68c-417a-8e75-76b05a6b9633

+



+

Note that I:

+
  • told it the names of the columns
  • showed it a sample of the data (pasted in the output of glimpse)
  • explicitly listed the values in the columns I wanted to recode (pasted in the output of count)
  • directly asked for R code
  • gave instructions to use the tidyverse (this makes the output more consistent with what you have learned)
+
+
+

Step 3: Debug and validate the R code GPT provides

+

+

In this example, ChatGPT provided the following code:

+
library(tidyverse)
+
+# Load the sweets dataset (assuming it's already loaded)
+sweets <- read_csv('https://t.ly/H9sDJ')
+
+# Recode the text responses into numeric variables
+sweets <- sweets %>%
+  mutate(Like_sweets = recode(`How much do you like sweets?`,
+    "I don't like them" = 0,
+    "I like them" = 1,
+    "I'm neutral" = 2
+  )) %>%
+  mutate(Like_chocolate = recode(`How much do you like chocolate`,
+    "I don't like them" = 0,
+    "I like them" = 1,
+    "I'm neutral" = 2
+  )) %>%
+  select(-`How much do you like sweets?`, -`How much do you like chocolate`)
+
+# Rename variables with spaces to use underscores and be shorter
+sweets <- sweets %>%
+  rename(
+    ID = `ID`,
+    Start_time = `Start time`,
+    Gender = `Gender`
+  )
+
+# Remove timestamp and email columns
+sweets <- sweets %>%
+  select(-`Start_time`, -`Email`)
+
+# Check the modified dataset
+glimpse(sweets)
+

Amazingly, this actually works!

+

However, it’s worth saying that it often doesn’t work on the first try, and success depends on writing a good prompt and giving the model plenty of information about the data and what you want to achieve.

+

Some tips:

+
  • Think first. Be clear about what you want to achieve.
  • Explain in detail what the data are like and show examples.
  • Give explicit instructions.
  • If code doesn’t work, ask for explanations or corrections.
  • Break tasks down into smaller steps (ask for less and build up your script).
  • Paste in the code you already have and ask for a completion with one extra step (e.g. something you don’t know how to do).
  • If you can afford it, use ChatGPT 4 or try new R-specific LLMs as they are (inevitably) released.
+

Where there are bugs in ChatGPT’s code, you can

+
  • fix these bugs by hand, or
  • run the code, report the errors to ChatGPT, and ask it to fix them for you.
+

The second option has some risks (ChatGPT’s code might delete your data, although it’s unlikely), and is probably slower. In this instance it took me quite a while to get ChatGPT to fix some simple errors, and doing so required understanding the code to some degree anyway.

+

As I recommend in the lectures, knowing how to use R well enough to fix the small errors introduced is likely to be more productive in the long run. Think of LLMs as your assistant rather than your replacement.
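Whichever route you take, it is worth checking the result mechanically rather than by eye. One quick validation (my suggestion, not part of the ChatGPT transcript above) is to cross-tabulate the original text against the recoded numbers: every label should pair with exactly one value, and there should be no NAs. The responses tibble below is invented for illustration:

```r
library(dplyr)
library(tibble)

# invented responses standing in for the imported sweets data
responses <- tibble(
  like_sweets = c("I like them", "I don't like them",
                  "I like them", "I'm neutral"))

preference.mappings <- c(
  "I don't like them" = -1, "I'm neutral" = 0, "I like them" = 1)

# one row per text label / number pairing; an NA or a label appearing
# with two different numbers would reveal a mistake in the mapping
responses %>%
  mutate(like_sweets_numeric = recode(like_sweets, !!!preference.mappings)) %>%
  count(like_sweets, like_sweets_numeric)
```

This check works the same whether the recoding code was written by you or generated by an LLM.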

+
+
+
+

Consolidation activity

+
+

Use this example dataset: https://t.ly/_Tcjo

+
  • Read in the data.
  • Rename the long column names to something short, and without spaces.
  • Recode at least three of the columns with data about sleep quality to be numeric.
  • Save the result of this work in a new variable called sleep.tidy.
  • Pivot the recoded variables and make a boxplot of responses to these questions.
  • Create a summary score called sleep_quality which is the sum of these recoded questions (use mutate).
  • Create a density plot of this summary score and interpret what you see (describe the pattern in plain English).
+
+ +
sleep <- read_csv('https://t.ly/_Tcjo')
+
+# used to check what response values are in each question
+sleep %>% 
+  select(`My sleep is affected by my study commitments`) %>% 
+  unique() %>% paste %>% cat
+
c("Agree", "Somewhat agree", "Somewhat disagree", "Disagree", "Neither agree nor disagree", "Strongly agree", "Strongly disagree")
+
sleep %>% 
+  select(`My electronic device usage negatively affects my sleep`) %>% 
+  unique() %>% paste %>% cat
+
c("Disagree", "Strongly agree", "Somewhat disagree", "Agree", "Somewhat agree", "Neither agree nor disagree", "Strongly disagree")
+
# we will use the same mapping for both questions because they have the same responses
+sleep.map <- c("Agree"=2, 
+               "Somewhat agree"=1, 
+               "Somewhat disagree"=-1, 
+               "Disagree"=-2, 
+               "Neither agree nor disagree"=0, 
+               "Strongly agree"=3, 
+               "Strongly disagree"=-3)
+
+
+sleep.tidy <- sleep %>% 
+  # now we recode the two text variables (we only need to use mutate once, though)
+  mutate(
+    sleep_study = recode(`My sleep is affected by my study commitments`, !!!sleep.map), 
+    sleep_electronic = recode(`My electronic device usage negatively affects my sleep`, !!!sleep.map)
+  )
+
# now we can pivot longer to make a plot
+sleep.tidy %>% 
+  pivot_longer(c(sleep_study, sleep_electronic)) %>% 
+  ggplot(aes(name, value)) + geom_boxplot()
+

+

And make a summary score combining both questions

+
sleep.tidy.withsummary <- sleep.tidy %>% 
  # and create the summary score
  mutate(sleep_quality = sleep_study + sleep_electronic)

# check the result: it looks ok
sleep.tidy.withsummary %>% glimpse()

Rows: 241
Columns: 13
$ uniqueid                                                                                    <chr> …
$ `Start time`                                                                                <dttm> …
$ `Completion time`                                                                           <dttm> …
$ `My sleep is affected by my study commitments`                                              <chr> …
$ `I achieve good quality sleep`                                                              <chr> …
$ `My electronic device usage negatively affects my sleep`                                    <chr> …
$ `Tiredness interferes with my concentration`                                                <chr> …
$ `My sleep is disturbed by external factors e.g. loud cars, housemates, lights, children...` <chr> …
$ `I often achieve eight hours of sleep`                                                      <chr> …
$ `I regularly stay up past 11pm`                                                             <chr> …
$ sleep_study                                                                                 <dbl> …
$ sleep_electronic                                                                            <dbl> …
$ sleep_quality                                                                               <dbl> …

# finally, make the requested density plot
sleep.tidy.withsummary %>% 
  ggplot(aes(sleep_quality)) + 
  geom_density()
+

+
+
+
+
+

Check your knowledge

+
  • What does it mean for data to be “tidy”? Identify the three key characteristics.

  • What function creates a new column in a dataset?

  • Which function allows you to choose a subset of the columns in a dataset?

  • What does pivot_longer do?

  • Why is pivot_longer useful when we want to make a faceted plot?

  • Why is wide data sometimes more useful than long data?

  • If you have RT data with 10 groups, what tidyverse ‘verbs’ (functions) would you use to calculate the mean for each group?

  • How can you read data into a dataframe from over the internet?

  • Does recode convert from text to numeric, or from numeric to text values?

  • Why is it important to recode text variables into numeric values when working with survey or questionnaire data?

  • Why is it important to copy and paste exact values when making a mapping variable for recode?

  • When renaming, is it rename(oldvarname=newvarname) or rename(newvarname=oldvarname)?

  • What does the clean_names function from the janitor package do?

  • Imagine a single cell of your dataset contains the string “conditionA_time_1”. What function should we apply to it?
+
+
+Wickham, Hadley. 2014. “Tidy Data.” +Journal of Statistical Software 59 (1): 1–23. https://doi.org/10.18637/jss.v059.i10. +
+
+
+
+ + + +
+ +
+
+ + + + + + + + + + diff --git a/docs/index.html b/docs/index.html index 5b0929d..090449e 100644 --- a/docs/index.html +++ b/docs/index.html @@ -874,8 +874,7 @@

Part 1: Learning R

Part 2: Data handling and visualisation

diff --git a/website/_first_chunk.R b/website/_first_chunk.R index 32d3e4f..d408d2c 100644 --- a/website/_first_chunk.R +++ b/website/_first_chunk.R @@ -10,12 +10,12 @@ library(psydata) knitr::opts_chunk$set( echo = TRUE, - collapse = F, + collapse = FALSE, comment = NA, - cache = FALSE, - message = FALSE + message = FALSE, + include=T ) -options(dplyr.summarise.inform = FALSE) +# options(dplyr.summarise.inform = FALSE) makermds <- function(video_data){ identifier <- video_data$identifier diff --git a/website/data-wrangling1.rmd b/website/data-wrangling1.rmd index 17887da..4539354 100644 --- a/website/data-wrangling1.rmd +++ b/website/data-wrangling1.rmd @@ -1,7 +1,7 @@ --- -title: 'Data wrangling 1' +title: 'Data wrangling' author: 'Ben Whalley' -date: "November 2021" +date: "October 2023" bibliography: [references.bib] biblio-style: apa6 link-citations: yes @@ -64,7 +64,9 @@ The `fuel` data also contains variables for weight and power. We can select just these columns and save them to a smaller dataframe like this: ```{r} -carperformance <- fuel %>% select(mpg, weight, power) +carperformance <- fuel %>% + select(mpg, weight, power) %>% + head() ``` #### Explanation of the commands @@ -80,8 +82,8 @@ When running the command you won't see any output --- but a new object was creat We can see the first few rows of our new smaller dataframe like this: -```{r} -carperformance %>% head() +```{r, include=T} +carperformance %>% head() ``` @@ -90,7 +92,7 @@ carperformance %>% head() Try selecting columns in a dataset for yourself: -- Use any of the built in dataset, creating a copy with just a subset of 3 of its columns. +- Use any of the built in datasets, creating a copy with just a subset of 3 of its columns. ::: @@ -142,11 +144,6 @@ It's often called **"row per observation"** data. 
### Using `pivot_longer()` - - -![Another term used is `melting` the data img: [TrueWarrior](https://www.reddit.com/r/gifs/comments/ppam4/ice_cream_melting_and_remelting/)](https://i.imgur.com/UBGhu.gif) - - > Pivoting is where you take a long data file (lots of rows, few columns) and make it wider. Or where you take a wide data file (lots of columns, few rows) and make it longer. @@ -158,15 +155,15 @@ To see why the function is called 'pivot_**longer**', imagine trying to reshape ```{r echo=F, message=F, warning=F} attitude %>% - head(2) + head(2) ``` If we use `pivot_longer` on this selection, we end up with this: -```{r message=F, warning=F} +```{r message=FALSE, warning=FALSE} attitude %>% head(2) %>% - pivot_longer(everything()) + pivot_longer(everything()) ``` **Explanation of the command**: @@ -193,9 +190,10 @@ We can create an explicit participant identifier by adding a new column. For thi ```{r} attitude_with_person <- attitude %>% mutate(person = row_number()) %>% - head(2) + head(2) -attitude_with_person +attitude_with_person %>% + pander() ``` Now we have a column called `person` which stores the row number. @@ -209,7 +207,8 @@ so we lose track of which response belonged to which participant, like this: ```{r} attitude_with_person %>% - pivot_longer(everything()) + pivot_longer(everything()) %>% + pander() ``` **Explanation of the output** Because we didn't tell `pivot_longer` which columns we wanted to pivot, it put all the values into a single new column called `value`. This included our participant identifier, `person` which is not what we wanted. @@ -222,7 +221,8 @@ We can exclude `person` from the pivoting by writing: ```{r} attitude_with_person %>% pivot_longer(-person) %>% - head() + head() %>% + pander() ``` **Explanation of the command and output**: @@ -286,7 +286,8 @@ This would be fiddly if we just tried to use `summarise` on wide format data. 
Bu attitude %>% pivot_longer(everything()) %>% group_by(Name=name) %>% - summarise(Mean = mean(value), SD=sd(value)) + summarise(Mean = mean(value), SD=sd(value)) %>% + pander() ``` :::{.exercise} @@ -320,7 +321,8 @@ attitude %>% filter(person < 6) %>% pivot_longer(-person) %>% group_by(person) %>% - summarise(mean(value)) + summarise(mean(value)) %>% + pander() ``` @@ -392,7 +394,7 @@ We want to _compare_ GDP in different _years_. We first need to select the data we want --- `country`, `year` and `GDP`, for the years after 1990: -```{r, message=F, warning=F} +```{r, message=F} development1990s <- development %>% select(country, year, gdp_per_capita) %>% filter(year >= 1990) @@ -407,7 +409,8 @@ development1990s %>% names_from = year, values_from = gdp_per_capita ) %>% - head() + head() %>% + pander() ``` **Explanation of the command and output**: @@ -431,13 +434,14 @@ Use the `funimagery` dataset in `psydata` and perform the following: When you finish your data should look like this: -```{r, echo=F} +```{r, echo=F, include=T} funimagery %>% select(intervention, kg1, kg2, kg3) %>% pivot_longer(-intervention) %>% group_by(name, intervention) %>% summarise(M=mean(value)) %>% - pivot_wider(names_from=intervention, values_from=M) + pivot_wider(names_from=intervention, values_from=M) %>% + pander() ``` @@ -450,67 +454,907 @@ funimagery %>% pivot_longer(-intervention) %>% group_by(name, intervention) %>% summarise(M=mean(value)) %>% - pivot_wider(names_from=intervention, values_from=M) + pivot_wider(names_from=intervention, values_from=M) %>% + pander() +``` + + +`r unhide()` + + + +::: + + + + + + +# Separating variables + +Sometimes we need to separate 'untidy' variables into tidy, long-form data. + +`r embed_youtube('NRaKlYGaXEs')` + +The code below generates simulated data for 100 individuals at three time points. The format is +similar to the way you might record experimental data in a spreadsheet. 
+ +```{r} +set.seed(1234) +N <- 100 +repeatmeasuresdata <- tibble(person = 1:N, + time_1 = rnorm(N), + time_2 = rnorm(N, 1), + time_3 = rnorm(N, 3)) + +repeatmeasuresdata %>% head(8) %>% + pander() +``` + + + +This variable, `repeatmeasuresdata`, is in **wide** format. Each row contains data for one participant, and each participant has three observations. + +As [we saw previously](data-wrangling1.html#pivotlonger), we can *pivot* --- i.e., reshape --- the data into longer format like so: + +```{r} +repeatmeasuresdata %>% + pivot_longer(starts_with("time")) %>% + arrange(person, name) %>% + head(8) %>% + pander() +``` + +The problem we have now is that `name` contains text which describes at which time the +observation was made. We probably want to store a *number* for each time-point, so we can make a plot with time +on the x axis. + +The `separate` command separates a single character column (`name`) into multiple columns. +Rather than have a column with labels of the form 'time_1', it can create two columns, with labels +'time' and '1' in each. + +```{r} +# convert to long form; extract the `time` as a new numeric column +longrepeatmeasuresdata <- repeatmeasuresdata %>% + pivot_longer(starts_with("time")) %>% + separate(name, into = c("variable", "time")) + +longrepeatmeasuresdata %>% head %>% + pander() +``` + + +Now the data are in long format, we can plot the points over time: + +```{r} +longrepeatmeasuresdata %>% + sample_n(30) %>% + ggplot(aes(x=time, y=value)) + + geom_point() +``` + + +### How does R know where to split the text? + + +In the example above, `separate` split data like `"time_1"`, `"time_2"` etc into two columns: `variable` and `time`. + +Q: How did it know to use the underscore (`_`) to split the data? + +A: The default is to split on anything which is not a letter or number. So `_` or a space, or `,` would all work. + + +Sometimes though we need to tell R explicitly what to use to sepatate the values. 
+ +If we had a column of email addresses we could split `ben.whalley@plymouth.ac.uk` into the username (e.g. `ben.whalley`) and domain name (`plymouth.ac.uk`) using the `@` symbol. + +To do this we just write `sep="@"` when we use separate. + + +:::{.exercise} + +The `messy_exp` dataset in `psydata` contains simulated RT data on 100 participants in 2 conditions (A and B) at three time points (1, 2, and 3). + +- Use the `separate()` function to split up the `condition` variable in this dataset and draw the following plot: + +```{r, echo=F} +messy_exp %>% + separate(condition, into=c("participant", "condition", "time")) %>% + ggplot(aes(time, rt, color=condition)) + + geom_boxplot(width=.5) + + labs(x="Time", y="Reaction time (ms)", color="Condition") +``` +`r hide("Show the code")` + +```{r, echo=T, eval=F} +messy_exp %>% + separate(condition, into=c("participant", "condition", "time")) %>% + ggplot(aes(time, rt, color=condition)) + + geom_boxplot(width=.5) + + labs(x="Time", y="Reaction time (ms)", color="Condition") +``` + +`r unhide()` + +::: + + + +:::{.exercise} + + +1. This file contains sample contact and address data for 100 people: + + + - Read the data into R (you can either use the URL above directly inside the `read_csv()` function, or download then re-upload the data to the server to do this) + + - Use the `separate` function to make a new variable which contains the *domain name* of these contacts' email address (e.g. yahoo.com, hotmail.com) + +> Note, you will need to use `sep="@"` to split the email addresses at the `@` symbol + +2. Use the `distinct` and/or `count` functions on the new variable you create containing the domain name. Look them up in the help file if you don't know which to use to answer these questions: + + - How many people had a Gmail account? + - Which domains had more than 10 users? 
+ + +`r hide("Show workings")` + +```{r} +# read the data directly from the URL +contacts <- read_csv('https://letterhub.com/wp-content/uploads/2018/03/100-contacts.csv') %>% + separate(email, into=c("user", "domain"), sep ="@") # uses the @ symbol as a separator +``` + +```{r} +# how many _different_ domains are there? +contacts %>% + distinct(domain) %>% + count() %>% + pander() ``` +```{r} +# how many people use gmail +contacts %>% + count(domain) %>% + filter(domain=="gmail.com") %>% + pander() +``` + +```{r} +# which domains had more than 10 users? +contacts %>% + count(domain) %>% + filter(n > 10) %>% + pander() +``` `r unhide()` +::: + + + +# Questionnaire data + + +```{r, echo=T, eval=F, include=F} +# import an excel file using rio +library(rio) +sweets <- import('data/sweets.xlsx') +sweets %>% write_csv('data/sweets.csv') +``` + +The file `sweets.csv` contains a small number of example rows of data exported from an online survey. + +The file is at: + + + + + +We can look at the first few rows of the data, using the `glimpse` command: + + +```{r} +sweets <- read_csv('https://t.ly/H9sDJ') +sweets %>% glimpse() +``` + + +:::{.exercise} + + +- Import the sweets data as shown above from: + +- Save it to a new variable called `sweets` ::: -# Using RMarkdown +## Tidying questionnaires + +When we look at the imported data it's useful to note: + +1. There are extra columns we don't need (at least for now). + +2. Some of our variable names are very long and annoying to type (for example + `How much do you like sweets?` is the name of one of our columns). + +3. Our responses are in **text** format, rather than as numbers. For example, + the data say `"I don't like them"` or `"I'm neutral"` rather than numbers + from a 1-5 scale. + + + +We need to sort each of these problems to make things more manageable for our +analysis. + + + +## Selecting and renaming + +> Remember, R makes using columns with spaces or other special characters very hard. We want to avoid this. 
+ +### Selecting + +To use columns with spaces in we must 'escape' the spaces and **let R know they are part of the name** rather than a gap between two different names. + +This video shows how (or read below): + +`r embed_youtube('aIMgsj5hTVA')` + + +To escape spaces and use columns with long names we use the backtick character (the +backwards facing apostrophe) to *wrap* the column name. + +**In general, if your columns contain spaces or other odd characters like hyphens or question marks then you will need to wrap them in backticks.** + + + +### Renaming {#renaming} -Watch this video on making tables and knitting your RMarkdown document. +Some of the imported variable names in the `sweets` data are long and awkward to use. -`r embed_youtube('XM7nxHIGeJU')` +Most researchers would rename these variables, to make them more usable in R code. + +You can rename variables like this: + +```{r, eval=F} +datasetname %>% + rename(NEW_COLUMN_NAME = OLD_COLUMN_NAME) +``` + +So for this example: + +```{r} +sweets %>% + rename( + like_sweets = `How much do you like sweets?`, + like_chocolate = `How much do you like chocolate`, + ) +``` + +**Explanation of the code**: We used `rename` to change the names of our +variables. We needed to wrap the long names of the questions in 'backtick' +symbols to make sure R understood it was a single column name. + +You should create a new variable to save the renamed dataset (with a descriptive name for use later on): + +```{r} +# create a new variable containing the renamed dataset +sweets.renamed <- sweets %>% + rename( + like_sweets = `How much do you like sweets?`, +) +``` :::{.exercise} -Create an RMarkdown document (if you haven't already) to save your work from today's session. +1. Create a copy of the sweets data in which you have selected only the two columns with long names. -- Knit this document to HTML format, and as a PDF. -- Try opening the HTML document in Word. +1. 
Create a second copy of the data where you have renamed the columns with long names to something short, and without spaces. ::: +### Renaming with the `janitor` package + + +A good alternative to renaming variables manually is to use the `clean_names` function in +the `janitor` package. + +```{r} +sweets %>% + janitor::clean_names() %>% + glimpse +``` + +**Explanation of the code and result**. I used the `clean_names` function within the `janitor` package without using `library`. I did this by typing `janitor::` and then the name of the function. In the result `clean_names` has made a new dataset: + +- Removed all special characters +- Made everything lower case (easier for R to autocomplete) +- Replaced spaces with underscores +- Made column names unique (this isn't always the case with imported data, but is important for R) + + +I typically use this function when importing any new data because it makes the naming and access of columns much more consistent and easier to remember. + + + + +# Recoding text {#using-recode} + +We noticed above that our responses were stored as text labels like +`"I don't like them"` rather than on a numeric scale. This makes it hard to use in +an analysis. + +**We need to _recode_ the text variables into numeric versions.** -This video shows how to control the output in an RMarkdown document: +### How to do it -`r embed_youtube('GGghELcv-As')` +First we must tell R what number we want to use for each text label. That is, we +create a **_mapping of numbers to labels_**. +This takes a few steps: + +1. Check **exactly** what the text values are which need to be mapped. +2. Make a **mapping variable** which assigns each text value a number value +3. 
Use the `recode` function with `mutate` to create a new, numeric column + + +This video walks you through the steps below: + + +`r embed_youtube('vaGrKPIHN4Q')` + + +### Step 1: Check EXACTLY what text labels we have + +To check which labels we need to recode, I select the column in question and use the `unique()` function. + +```{r} +# check exactly what text values are in the dataset? +sweets %>% + select(`How much do you like sweets?`) %>% + unique() +``` :::{.exercise} -- Try using the `echo` and `results` chunk options in your RMarkdown document. Knit each time, and note the changes. +Do the same to find out the possible values of the `How much do you like chocolate` column. + +`r hide("Show answer")` + +```{r} +sweets %>% + select(`How much do you like chocolate`) %>% + distinct() %>% + pander() +``` + +`r unhide()` + +::: + + + +### Step 2: Make a mapping variable + +We do this by creating what R calles a **named vector**, which is a special kind of list. + +To make a named vector we use the the `c()` function. The letter `c` +here just stands for 'combine' --- i.e. 'combine these things into a list'. + +This is a simple example: -- Make a figure using ggplot. Change the `fig.width` and `fig.height` values to note the change to the output. +```{r} +mapping.list <- c("No" = 0, "Yes" = 1) +``` + + +We could then use this mapping to recode a column of data which contained the words "No" or "Yes + + +
+ + +*A useful trick* when creating your own mappings is to use R to do the formatting for you (see the video above for a demo). + +Re-using the code from the previous step, we use `unique()` to show us the *unique values* for the question about sweets. + +We then pipe the result to the `paste()` and `cat()` functions, like this: + +```{r} +# the hack we use as a short-cut to creating a mapping variable +sweets %>% + select(`How much do you like sweets?`) %>% + unique() %>% + paste() %>% cat() +``` + +**Explanation of the output**: Using `paste` and `cat` is a bit of a hack. When we run this code we see the output `c("I don't like them", "I'm neutral", "I like them")`. This is a list of the values in the `sweets` data for this question, formatted in a way that will be useful to us in the next step. + + +We then copy and paste this output into a NEW code block, and EDIT it to assign our mappings: + +```{r} +preference.mappings <- c("I don't like them" = -1, "I'm neutral" = 0, "I like them" = 1) +``` + +**Explanation of the code**: We used the previous output to create a mapping. By adding the parts which read `= -1` and `= 0` etc, we have told R what value we want to assign for each label. + + + +:::{.tip} + +Q: How do you know what number values to assign? + +A: It doesn't matter, provided: + +- The intervals between each options are the same and +- Each text value has a different number + +So, if we had a Likert-scale ranging from "Completely agree" to "Completely disagree" in 7 increments, we could score this from `0 - 6` or `1 - 7`, or `-3 - 3`. These would all be fine. 
+ + +::: + + + +### Step 3: Use the mapping variable to recode the column + +We can use our new mapping with the `mutate` and `recode` functions to make a **new column**, containing numbers rather than text: + +```{r} +sweets.recoded <- sweets %>% + rename( + like_sweets = `How much do you like sweets?`, + like_chocolate = `How much do you like chocolate`, + ) %>% + # use recode to convert text response using preference.mappings + mutate( + like_sweets_numeric = + recode(like_sweets, !!!preference.mappings) + ) +``` + +We can see this new column if we use `glimpse`: + +```{r} +sweets.recoded %>% glimpse() +``` + + +**Explanation of the code**: + +- The start of the first line is `sweets.recoded <-` which means make a new variable called `sweets.recoded`. +- Then we use `mutate` to create a **new** column called `like_sweets_numeric`. +- We make this column using `recode` on the question about liking sweets. +- We use the `preference.mappings` mapping to specify what numeric score to give each of the text values. + +**Watch out for the exclamation marks`!!!`**: In the code above there are +three exclamation marks, `!!!`, before the mapping; make sure you do the same. + + +`r hide('Optional: Explain the 3 exclamation marks')` + + +### Three exclamation marks: `!!!` {#explain-exclamationmarks} + +*Understanding this isn't necessary to get on with the course. Only read +this if you are interested!* + +In the code above when we used `recode` we used three exclamation marks just +before our list. + +We defined the mapping: +```{r, eval=F} +likert.responses <- c( + "I hate them" = 1, + "I don't like them" = 2, + "I'm neutral" = 3, + "I like them" = 4, + "I can't live without them" = 5) + +``` + +And then used it with `recode`, with the three exclamation marks. 
+ +```{r, eval=F} +liking_of_sweets_data %>% + mutate(like_sweets_numeric = recode(like_sweets_text, !!!likert.responses)) %>% + pander() +``` + +The reason for this is that `recode` actually expects us to specify the mapping +for it like this: + +```{r, eval=F} +liking_of_sweets_data %>% + mutate(like_sweets_numeric = recode(like_sweets_text, + "I hate them" = 1, + "I don't like them" = 2 ...)) +``` + +But this means we have to repeat the mapping for each of the questions. Because +all the questions use the same mapping this gets repetitive, and can lead to +errors. + +The three exclamation marks `!!!` _unpacks_ the list for us. So writing +`!!!likert.responses` saves us the bother of writing it out in full each time. + + +`r unhide()` + + + +### Summary/video explanation + +This is one of the trickiest bits of R code we use. I've included an annotated video of just these steps as a reference below: + +`r embed_youtube('DCeAlHUZsF0')` + + + + +:::{.exercise} + +- Use this three-step process to create a recoded version of the `like_chocolate` variable. + +- Remember to watch the video at the start of this section, or the short version in the green box above, if anything is unclear. ::: -:::{#tenminmarkdown .exercise} -Check these resources: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +# Combining scores {#summary-score-of-questionnaire} + +Often questionnaires are designed to make repeated measurements of the same phenomena, which can then be summed or averaged to +create a more reliable measure. + +We've already seen how `mutate()` creates a new column. 
We can use this again to create the *sum* of +responses to both questions: -- [Markdown 'cheatsheet'](https://commonmark.org/help/) -- Do the '10 minute markdown' tutorial here: + +```{r, include=F, echo=F} +# included because we commented out stuff above +sweets.recoded <- sweets.recoded %>% + # create two columns at once using mutate + mutate( + like_sweets_numeric = recode(like_sweets, !!!preference.mappings), + like_chocolate_numeric = recode(like_chocolate, !!!preference.mappings) + ) +``` + + +```{r} +sweets.recoded %>% + # mutate to create column containing sum of both recoded questions + mutate(liking = like_sweets_numeric + like_chocolate_numeric) %>% + # select the new variable we created + select(ID, liking) %>% + pander() +``` + + +**Explanation of the code**: + +- We added both the questions which asked about 'liking' together. This created a new column containing the combined score, called `liking`. +- In the third line we selected only the `ID` column, plus the new column we made. + + + + +# Using ChatGPT to automate things + +In the sections above I explained how to tidy up a simple dataset. However +it turns out many of these tasks are easily solved using ChatGPT. + +You now have a choice you didn't have a few years ago: + +1. Learn how to do these steps by hand (and sometimes use ChatGPT to automate the work) +2. Learn just enough to be able to get ChatGPT to write code for you (and hope it's correct) + +I would recommend option 1 (at least until LLMs make improvements in code generation) because it will likely be faster, more accurate and will promote your understanding. + +But as a guide, here's how I might solve this problem with ChatGPT as an assistant. + + +#### Step 1. Show ChatGPT the data + +I use these command to output some information about the data, in a format that's easy to copy into ChatGPT. 
+ +```{r, eval=F} +sweets <- read_csv('https://t.ly/H9sDJ') + +# lists the column names +sweets %>% names() + +sweets %>% glimpse() + +# list the unique values in response to this question +sweets %>% count(`How much do you like sweets?`) %>% + pander() +``` + + + + +#### Step 2: Ask GPT to recode it for us + +This link shows how I asked ChatGPT to recode the `sweets` data: + +https://chat.openai.com/share/325b1f77-c68c-417a-8e75-76b05a6b9633 + +

+ +Note that I: + +- told it the names of the columns +- showed it a sample of the data (pasted in the output of `glimpse`) +- explicitly listed the values in the columns I wanted to recode (pasted in output of `count`) +- directly asked for R code +- gave instructions to use the tidyverse (this makes the output more consistent with what you have learned) + + + +#### Step 3: Debug and validate the R code GPT provides + + +![](images/recodewithchatgpt.png) + +In this example, ChatGPT provided the following code: + + +```{r, eval=F} +library(tidyverse) + +# Load the sweets dataset (assuming it's already loaded) +sweets <- read_csv('https://t.ly/H9sDJ') + +# Recode the text responses into numeric variables +sweets <- sweets %>% + mutate(Like_sweets = recode(`How much do you like sweets?`, + "I don't like them" = 0, + "I like them" = 1, + "I'm neutral" = 2 + )) %>% + mutate(Like_chocolate = recode(`How much do you like chocolate`, + "I don't like them" = 0, + "I like them" = 1, + "I'm neutral" = 2 + )) %>% + select(-`How much do you like sweets?`, -`How much do you like chocolate`) + +# Rename variables with spaces to use underscores and be shorter +sweets <- sweets %>% + rename( + ID = `ID`, + Start_time = `Start time`, + Gender = `Gender` + ) + +# Remove timestamp and email columns +sweets <- sweets %>% + select(-`Start_time`, -`Email`) + +# Check the modified dataset +glimpse(sweets) + +``` -Try using some of the markdown formatting to structure your own document. Knit the document as you make changes and note the results. +Amazingly, this actually works! + +However it's worth saying that it *often* doesn't work on first try, and it is dependent on writing a good prompt to the model, and giving it plenty of information about the data and what you want to achieve. + +Some tips: + +- Think first. Be clear about what you want to achieve. +- Explain in detail what the data are like and show examples. +- Give explicit instructions. 
+- If code doesn't work, ask for explanations or corrections.
+- Break tasks down into smaller steps (ask for less and build up your script).
+- Paste in the code you already have and ask for a completion with one extra step (e.g. something you don't know how to do).
+- If you can afford it, use ChatGPT 4 or try new R-specific LLMs as they are (inevitably) released.
+
+
+
+Where there are bugs in ChatGPT's code, you can:
+
+- fix the bugs by hand, or
+- run the code, report the errors to ChatGPT, and ask it to fix them for you.
+
+The second option has some risks (ChatGPT's code _might_ delete your data, although it's unlikely), and is probably slower. In this instance it took me quite a while to get ChatGPT to fix some simple errors, and doing so required understanding the code to some degree anyway.
+
+As I recommend in the lectures, knowing how to use R well enough to fix the small errors LLMs introduce is likely to be more
+productive in the long run. Think of LLMs as your assistant rather than your replacement.
+
+
+
+
+# Consolidation activity
+
+:::{.exercise}
+
+Use this example dataset: <https://t.ly/_Tcjo>
+
+- Read in the data.
+- Rename the long column names to something short, and without spaces
+- Recode at least three of the columns with data about sleep quality to be numeric
+- Save the result of this work in a new variable called `sleep.tidy`
+- Pivot the recoded variables and make a boxplot of responses to these questions
+- Create a summary score called `sleep_quality` which is the sum of these recoded questions (use mutate)
+- Create a density plot of this summary score and interpret what you see (describe the pattern in plain English)
+
+
+`r hide("Show complete code example")`
+
+
+```{r, eval=F, echo=F, include=F}
+sleep <- readxl::read_xlsx('data/sleep.xlsx')
+sleep %>% write_csv('data/sleep.csv')
+# https://gist.github.com/benwhalley/d5c168565ca770ff99b51442115f4e56
+```
+
+```{r}
+sleep <- read_csv('https://t.ly/_Tcjo')
+
+# used to check what response values are present in each question
+sleep %>%
+  select(`My sleep is affected by my study commitments`) %>%
+  unique() %>% paste %>% cat
+
+sleep %>%
+  select(`My electronic device usage negatively affects my sleep`) %>%
+  unique() %>% paste %>% cat
+```
+
+
+```{r}
+# we will use the same mapping for both questions because they have the same responses
+sleep.map <- c("Agree"=2,
+               "Somewhat agree"=1,
+               "Somewhat disagree"=-1,
+               "Disagree"=-2,
+               "Neither agree nor disagree"=0,
+               "Strongly agree"=3,
+               "Strongly disagree"=-3)
+
+
+sleep.tidy <- sleep %>%
+  # now we recode the two text variables (we only need to use mutate once, though)
+  mutate(
+    sleep_study = recode(`My sleep is affected by my study commitments`, !!!sleep.map),
+    sleep_electronic = recode(`My electronic device usage negatively affects my sleep`, !!!sleep.map)
+  )
+
+```
+
+```{r}
+# now we can pivot longer to make a plot
+sleep.tidy %>%
+  pivot_longer(c(sleep_study, sleep_electronic)) %>%
+  ggplot(aes(name, value)) + geom_boxplot()
+```
+
+Then make a summary score combining both questions:
+
+```{r}
+sleep.tidy.withsumary <- sleep.tidy %>%
+  # and create the summary score
+  mutate(sleep_quality = sleep_study + sleep_electronic)
+
+# check the result; it looks ok
+sleep.tidy.withsumary %>% glimpse
+```
+
+
+```{r}
+# finally, make the requested density plot
+sleep.tidy.withsumary %>%
+  ggplot(aes(sleep_quality)) +
+    geom_density()
+```
+
+
+`r unhide()`
+
+:::
+
+
+# Check your knowledge
+
+- What does it mean for data to be "tidy"? Identify the three key characteristics
+- What function creates a new column in a dataset?
+- Which function allows you to choose a subset of the columns in a dataset?
+- What does `pivot_longer` do?
+- Why is `pivot_longer` useful when we want to make a faceted plot?
+- Why is wide data sometimes more useful than long data?
+- If you have RT data with 10 groups, what tidyverse 'verbs' (functions) would you use to calculate the mean for each group?
+- How can you read data into a dataframe from over the internet?
+- Does `recode` convert from text to numeric, or from numeric to text values?
+- Why is it important to recode text variables into numeric values when working with survey or questionnaire data?
+- Why is it important to copy and paste exact values when making a mapping variable for `recode`?
+- When renaming, is it: `rename(oldvarname=newvarname)` or `rename(newvarname=oldvarname)`?
+- What does the `clean_names` function from the janitor package do?
+
+- Imagine a single cell of your dataset contains the string "conditionA_time_1". What function would we use to split it into separate columns?
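+`r hide("Show a worked hint for the last few questions")`
+
+This sketch uses a small, made-up data frame (the column names here are invented for illustration, not from a real dataset). It shows three of the patterns asked about above: `rename` takes `newname = oldname`, `recode` maps *from* existing text values *to* new numeric values, and `separate` splits strings like "conditionA_time_1" into several columns:
+
+```{r, eval=F}
+library(tidyverse)
+
+demo <- tibble(
+  `A very long column name` = c("conditionA_time_1", "conditionB_time_2"),
+  rating = c("Agree", "Disagree")
+)
+
+demo %>%
+  # rename uses newname = oldname
+  rename(condition_code = `A very long column name`) %>%
+  # recode maps from the existing text values to new numeric values
+  mutate(rating_numeric = recode(rating, "Agree" = 1, "Disagree" = -1)) %>%
+  # separate splits one column into several, here at the underscores
+  separate(condition_code, into = c("condition", "time", "number"), sep = "_")
+```
+
+`r unhide()`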
+ + diff --git a/website/data/expdata.csv b/website/data/expdata.csv index 938ee4f..b117f4b 100644 --- a/website/data/expdata.csv +++ b/website/data/expdata.csv @@ -1,241 +1,241 @@ Condition,stimuli,p,RT -A,S1,1,151.84776423196666 -B,S1,1,313.6829804494978 -C,S1,1,556.6156324888166 -A,S2,1,53.63154468977615 -B,S2,1,560.3667345910135 -C,S2,1,586.0361000228185 -A,S3,1,504.9853511354472 -B,S3,1,271.76747315203426 -C,S3,1,267.13421840610965 -A,S4,1,408.10440568078604 -B,S4,1,290.03985349106676 -C,S4,1,161.36798430015438 -A,S1,2,255.50130879481037 -B,S1,2,257.3931997621075 -C,S1,2,515.9351414318418 -A,S2,2,660.7350860691499 -B,S2,2,281.09785399889824 -C,S2,2,181.5313501533949 -A,S3,2,423.8490263039773 -B,S3,2,1359.8558994933608 -C,S3,2,465.87373347381686 -A,S4,2,532.0208925628708 -B,S4,2,299.82260569609025 -C,S4,2,570.4808406559827 -A,S1,3,276.88420307190347 -B,S1,3,-48.44492876856131 -C,S1,3,397.7308306604757 -A,S2,3,369.20729714264473 -B,S2,3,420.4667599437105 -C,S2,3,175.75141187452851 -A,S3,3,1140.5692723250231 -B,S3,3,290.4648722993361 -C,S3,3,230.28697475197492 -A,S4,3,528.5924453450899 -B,S4,3,31.813998392204695 -C,S4,3,123.79427205542561 -A,S1,4,-33.07974711640898 -B,S1,4,-32.1928257238967 -C,S1,4,169.9558671025615 -A,S2,4,540.0910641074329 -B,S2,4,935.486298817068 -C,S2,4,145.5190846486803 -A,S3,4,418.4080834938578 -B,S3,4,343.6480731049579 -C,S3,4,162.2916158550966 -A,S4,4,385.1035649872962 -B,S4,4,136.94607107866568 -C,S4,4,105.83310599052635 -A,S1,5,322.4438792126509 -B,S1,5,124.68076645008477 -C,S1,5,-96.69691629444854 -A,S2,5,502.64991286754605 -B,S2,5,136.60001584177485 -C,S2,5,157.59672085803123 -A,S3,5,642.5169692512858 -B,S3,5,605.3298924892844 -C,S3,5,1017.0961290982445 -A,S4,5,443.12348603998544 -B,S4,5,999.6149351552258 -C,S4,5,125.91617848378962 -A,S1,6,696.3755036856528 -B,S1,6,1117.2505470504768 -C,S1,6,232.28388001860827 -A,S2,6,475.0746400380873 -B,S2,6,422.7206552498468 -C,S2,6,1071.8143833330241 -A,S3,6,336.7726075844779 
-B,S3,6,902.702396766663 -C,S3,6,887.5081684186812 -A,S4,6,825.2059089506679 -B,S4,6,427.06956193657174 -C,S4,6,295.82764232349666 -A,S1,7,366.4807655046304 -B,S1,7,419.49813206057377 -C,S1,7,917.0050288693326 -A,S2,7,645.6236901379967 -B,S2,7,77.41548442318401 -C,S2,7,226.77400804092565 -A,S3,7,795.3754033411568 -B,S3,7,333.5012188773196 -C,S3,7,372.8009463863774 -A,S4,7,639.8424598343279 -B,S4,7,81.10508301229817 -C,S4,7,373.9511201774086 -A,S1,8,767.2966802998485 -B,S1,8,434.31708026583567 -C,S1,8,390.4892275784176 -A,S2,8,560.8554384245817 -B,S2,8,368.8433630803587 -C,S2,8,118.00992290456242 -A,S3,8,680.9645562127699 -B,S3,8,503.90330026002346 -C,S3,8,1041.5604771559724 -A,S4,8,1096.6538074585874 -B,S4,8,285.1666732627405 -C,S4,8,536.2160655930027 -A,S1,9,168.34636526570085 -B,S1,9,490.07201196609515 -C,S1,9,520.2514341747806 -A,S2,9,1625.5711178250654 -B,S2,9,555.5429316923219 -C,S2,9,290.6973325891536 -A,S3,9,723.9144428172135 -B,S3,9,283.34608646662065 -C,S3,9,201.76227583502703 -A,S4,9,761.1200158149613 -B,S4,9,185.0390622158481 -C,S4,9,476.4739239830801 -A,S1,10,591.2861862244392 -B,S1,10,227.9672485038239 -C,S1,10,193.00652957671758 -A,S2,10,481.5013849501106 -B,S2,10,136.40682966429114 -C,S2,10,705.7478609631512 -A,S3,10,708.0685123599787 -B,S3,10,699.2107953185892 -C,S3,10,107.45072584883377 -A,S4,10,761.8780263836466 -B,S4,10,643.2633956494485 -C,S4,10,417.14194319197566 -A,S1,11,417.6043580491654 -B,S1,11,65.9691331811552 -C,S1,11,912.153928559884 -A,S2,11,990.4576106589036 -B,S2,11,1092.0609842102585 -C,S2,11,449.2486359883404 -A,S3,11,487.0450496736713 -B,S3,11,53.455037622349494 -C,S3,11,248.73647849831616 -A,S4,11,783.3121975089583 -B,S4,11,766.0995565793528 -C,S4,11,503.1248152278404 -A,S1,12,159.79678892053175 -B,S1,12,425.61444950259727 -C,S1,12,-76.80344415450652 -A,S2,12,573.1117138811701 -B,S2,12,333.7620100717479 -C,S2,12,-22.829958403890515 -A,S3,12,1061.6949184630173 -B,S3,12,252.10540638978785 -C,S3,12,328.80590240825734 
-A,S4,12,1272.3249964237386 -B,S4,12,630.5950783158695 -C,S4,12,392.893758487725 -A,S1,13,645.7962657913253 -B,S1,13,347.3555425720548 -C,S1,13,759.5831210838531 -A,S2,13,802.061843415885 -B,S2,13,586.1090178466516 -C,S2,13,533.6141180194898 -A,S3,13,569.317543638115 -B,S3,13,454.6289030775697 -C,S3,13,1013.3008179654635 -A,S4,13,412.3865478026221 -B,S4,13,462.06171749400016 -C,S4,13,900.433595993198 -A,S1,14,404.779914839551 -B,S1,14,15.493684640864473 -C,S1,14,49.049378122538485 -A,S2,14,565.0334369004647 -B,S2,14,196.64305533270522 -C,S2,14,349.2537665078821 -A,S3,14,556.9916786650805 -B,S3,14,371.291034087816 -C,S3,14,553.0818373697568 -A,S4,14,938.9139217672862 -B,S4,14,1029.8512057667392 -C,S4,14,404.56178137994664 -A,S1,15,379.6037242346109 -B,S1,15,689.5923222015842 -C,S1,15,775.0616013191452 -A,S2,15,715.6351752049444 -B,S2,15,329.18659637097596 -C,S2,15,-2.131116256160084 -A,S3,15,1279.7542796410357 -B,S3,15,198.98086579641 -C,S3,15,133.33350313182325 -A,S4,15,2129.278106691171 -B,S4,15,497.4948530271481 -C,S4,15,415.06223783128837 -A,S1,16,-107.71164438386046 -B,S1,16,216.2112655281391 -C,S1,16,521.2549633131663 -A,S2,16,855.1591588127549 -B,S2,16,728.6605906393722 -C,S2,16,1161.7866241904717 -A,S3,16,1170.0843948763147 -B,S3,16,281.6961760061633 -C,S3,16,654.1053635944277 -A,S4,16,629.9874260618793 -B,S4,16,274.00148959318363 -C,S4,16,-138.13472665994874 -A,S1,17,252.0776995401489 -B,S1,17,372.49596031932424 -C,S1,17,956.5573432857968 -A,S2,17,889.275084624795 -B,S2,17,624.9108461825859 -C,S2,17,168.81592136839583 -A,S3,17,759.5081308692047 -B,S3,17,-42.984696467849915 -C,S3,17,579.0440685971665 -A,S4,17,968.3142825915509 -B,S4,17,481.8931295384157 -C,S4,17,652.8970432164011 -A,S1,18,576.7417032329305 -B,S1,18,453.4554233588045 -C,S1,18,827.5090335046684 -A,S2,18,1144.995262116919 -B,S2,18,434.8375968787409 -C,S2,18,135.3771843857557 -A,S3,18,856.7926204014946 -B,S3,18,310.69620562465207 -C,S3,18,953.3467100976197 -A,S4,18,214.42839095431896 
-B,S4,18,306.49704726367145 -C,S4,18,558.0137903795882 -A,S1,19,429.48706375579377 -B,S1,19,101.47700923671476 -C,S1,19,167.55306809578792 -A,S2,19,940.9003611964886 -B,S2,19,722.3995775563969 -C,S2,19,638.7554954378697 -A,S3,19,1704.5282507999584 -B,S3,19,826.6247476967856 -C,S3,19,514.2928864537262 -A,S4,19,478.153442985565 -B,S4,19,1607.511686914863 -C,S4,19,644.7447534999262 -A,S1,20,279.35064863190905 -B,S1,20,289.24805963027217 -C,S1,20,163.04120476047092 -A,S2,20,603.8231281091075 -B,S2,20,176.32042335116756 -C,S2,20,460.54510554304136 -A,S3,20,818.5647631989434 -B,S3,20,143.53685611972338 -C,S3,20,-168.62782215042563 -A,S4,20,610.5837242327354 -B,S4,20,433.8867159794919 -C,S4,20,615.9959397025937 +Low,Stimulus 1,1,-10.728290804908085 +Med,Stimulus 1,1,313.6829804494978 +High,Stimulus 1,1,556.6156324888166 +Low,Stimulus 2,1,-80.6265931524629 +Med,Stimulus 2,1,560.3667345910135 +High,Stimulus 2,1,586.0361000228185 +Low,Stimulus 3,1,264.4697487435261 +Med,Stimulus 3,1,271.76747315203426 +High,Stimulus 3,1,267.13421840610965 +Low,Stimulus 4,1,186.50667542343228 +Med,Stimulus 4,1,290.03985349106676 +High,Stimulus 4,1,161.36798430015438 +Low,Stimulus 1,2,67.07654247308977 +Med,Stimulus 1,2,257.3931997621075 +High,Stimulus 1,2,515.9351414318418 +Low,Stimulus 2,2,392.3522157325964 +Med,Stimulus 2,2,281.09785399889824 +High,Stimulus 2,2,181.5313501533949 +Low,Stimulus 3,2,199.07932712011365 +Med,Stimulus 3,2,1359.8558994933608 +High,Stimulus 3,2,465.87373347381686 +Low,Stimulus 4,2,286.46204636432736 +Med,Stimulus 4,2,299.82260569609025 +High,Stimulus 4,2,570.4808406559827 +Low,Stimulus 1,3,83.50741788815196 +Med,Stimulus 1,3,-48.44492876856131 +High,Stimulus 1,3,397.7308306604757 +Low,Stimulus 2,3,155.6266405204775 +Med,Stimulus 2,3,420.4667599437105 +High,Stimulus 2,3,175.75141187452851 +Low,Stimulus 3,3,799.4314195530073 +Med,Stimulus 3,3,290.4648722993361 +High,Stimulus 3,3,230.28697475197492 +Low,Stimulus 4,3,283.6679289807756 +Med,Stimulus 
4,3,31.813998392204695 +High,Stimulus 4,3,123.79427205542561 +Low,Stimulus 1,4,-137.27736817948855 +Med,Stimulus 1,4,-32.1928257238967 +High,Stimulus 1,4,169.9558671025615 +Low,Stimulus 2,4,293.0449165317996 +Med,Stimulus 2,4,935.486298817068 +High,Stimulus 2,4,145.5190846486803 +Low,Stimulus 3,4,194.7299615181082 +Med,Stimulus 3,4,343.6480731049579 +High,Stimulus 3,4,162.2916158550966 +Low,Stimulus 4,4,168.2144240596191 +Med,Stimulus 4,4,136.94607107866568 +High,Stimulus 4,4,105.83310599052635 +Low,Stimulus 1,5,118.8735663518923 +Med,Stimulus 1,5,124.68076645008477 +High,Stimulus 1,5,-96.69691629444854 +Low,Stimulus 2,5,262.57446834890607 +Med,Stimulus 2,5,136.60001584177485 +High,Stimulus 2,5,157.59672085803123 +Low,Stimulus 3,5,377.255540664695 +Med,Stimulus 3,5,605.3298924892844 +High,Stimulus 3,5,1017.0961290982445 +Low,Stimulus 4,5,214.5246914766072 +Med,Stimulus 4,5,999.6149351552258 +High,Stimulus 4,5,125.91617848378962 +Low,Stimulus 1,6,421.9801958256844 +Med,Stimulus 1,6,1117.2505470504768 +High,Stimulus 1,6,232.28388001860827 +Low,Stimulus 2,6,240.25265481780474 +Med,Stimulus 2,6,422.7206552498468 +High,Stimulus 2,6,1071.8143833330241 +Low,Stimulus 3,6,130.0890717895324 +Med,Stimulus 3,6,902.702396766663 +High,Stimulus 3,6,887.5081684186812 +Low,Stimulus 4,6,530.0175411184391 +Med,Stimulus 4,6,427.06956193657174 +High,Stimulus 4,6,295.82764232349666 +Low,Stimulus 1,7,153.4722014770664 +Med,Stimulus 1,7,419.49813206057377 +High,Stimulus 1,7,917.0050288693326 +Low,Stimulus 2,7,379.82759488601494 +Med,Stimulus 2,7,77.41548442318401 +High,Stimulus 2,7,226.77400804092565 +Low,Stimulus 3,7,504.8796976110856 +Med,Stimulus 3,7,333.5012188773196 +High,Stimulus 3,7,372.8009463863774 +Low,Stimulus 4,7,375.04210444560033 +Med,Stimulus 4,7,81.10508301229817 +High,Stimulus 4,7,373.9511201774086 +Low,Stimulus 1,8,481.28274487253543 +Med,Stimulus 1,8,434.31708026583567 +High,Stimulus 1,8,390.4892275784176 +Low,Stimulus 2,8,310.01935695143914 +Med,Stimulus 
2,8,368.8433630803587 +High,Stimulus 2,8,118.00992290456242 +Low,Stimulus 3,8,409.15408533736945 +Med,Stimulus 3,8,503.90330026002346 +High,Stimulus 3,8,1041.5604771559724 +Low,Stimulus 4,8,761.5630123167806 +Med,Stimulus 4,8,285.1666732627405 +High,Stimulus 4,8,536.2160655930027 +Low,Stimulus 1,9,1.4228478777317264 +Med,Stimulus 1,9,490.07201196609515 +High,Stimulus 1,9,520.2514341747806 +Low,Stimulus 2,9,1223.304091502525 +Med,Stimulus 2,9,555.5429316923219 +High,Stimulus 2,9,290.6973325891536 +Low,Stimulus 3,9,444.9548331949459 +Med,Stimulus 3,9,283.34608646662065 +High,Stimulus 3,9,201.76227583502703 +Low,Stimulus 4,9,476.10065902978226 +Med,Stimulus 4,9,185.0390622158481 +High,Stimulus 4,9,476.4739239830801 +Low,Stimulus 1,10,334.988090563548 +Med,Stimulus 1,10,227.9672485038239 +High,Stimulus 1,10,193.00652957671758 +Low,Stimulus 2,10,245.44557004645617 +Med,Stimulus 2,10,136.40682966429114 +High,Stimulus 2,10,705.7478609631512 +Low,Stimulus 3,10,431.72676079541725 +Med,Stimulus 3,10,699.2107953185892 +High,Stimulus 3,10,107.45072584883377 +Low,Stimulus 4,10,476.73644155767613 +Med,Stimulus 4,10,643.2633956494485 +High,Stimulus 4,10,417.14194319197566 +Low,Stimulus 1,11,194.08788816252684 +Med,Stimulus 1,11,65.9691331811552 +High,Stimulus 1,11,912.153928559884 +Low,Stimulus 2,11,670.4275234695538 +Med,Stimulus 2,11,1092.0609842102585 +High,Stimulus 2,11,449.2486359883404 +Low,Stimulus 3,11,249.9296076116533 +Med,Stimulus 3,11,53.455037622349494 +High,Stimulus 3,11,248.73647849831616 +Low,Stimulus 4,11,494.7341055591286 +Med,Stimulus 4,11,766.0995565793528 +High,Stimulus 4,11,503.1248152278404 +Low,Stimulus 1,12,-4.886312309001056 +Med,Stimulus 1,12,425.61444950259727 +High,Stimulus 1,12,-76.80344415450652 +Low,Stimulus 2,12,320.0628487726841 +Med,Stimulus 2,12,333.7620100717479 +High,Stimulus 2,12,-22.829958403890515 +Low,Stimulus 3,12,731.4914671195629 +Med,Stimulus 3,12,252.10540638978785 +High,Stimulus 3,12,328.80590240825734 +Low,Stimulus 
4,12,913.6161228719614 +Med,Stimulus 4,12,630.5950783158695 +High,Stimulus 4,12,392.893758487725 +Low,Stimulus 1,13,379.9704990870449 +Med,Stimulus 1,13,347.3555425720548 +High,Stimulus 1,13,759.5831210838531 +Low,Stimulus 2,13,510.5082392007562 +Med,Stimulus 2,13,586.1090178466516 +High,Stimulus 2,13,533.6141180194898 +Low,Stimulus 3,13,316.9518024878938 +Med,Stimulus 3,13,454.6289030775697 +High,Stimulus 3,13,1013.3008179654635 +Low,Stimulus 4,13,189.9220962539962 +Med,Stimulus 4,13,462.06171749400016 +High,Stimulus 4,13,900.433595993198 +Low,Stimulus 1,14,183.8571800549136 +Med,Stimulus 1,14,15.493684640864473 +High,Stimulus 1,14,49.049378122538485 +Low,Stimulus 2,14,313.44105867928795 +Med,Stimulus 2,14,196.64305533270522 +High,Stimulus 2,14,349.2537665078821 +Low,Stimulus 3,14,306.85686104559403 +Med,Stimulus 3,14,371.291034087816 +High,Stimulus 3,14,553.0818373697568 +Low,Stimulus 4,14,626.4359340751574 +Med,Stimulus 4,14,1029.8512057667392 +High,Stimulus 4,14,404.56178137994664 +Low,Stimulus 1,15,163.8541189941016 +Med,Stimulus 1,15,689.5923222015842 +High,Stimulus 1,15,775.0616013191452 +Low,Stimulus 2,15,438.04053295149265 +Med,Stimulus 2,15,329.18659637097596 +High,Stimulus 2,15,-2.131116256160084 +Low,Stimulus 3,15,920.078535716229 +Med,Stimulus 3,15,198.98086579641 +High,Stimulus 3,15,133.33350313182325 +Low,Stimulus 4,15,1671.652153525454 +Med,Stimulus 4,15,497.4948530271481 +High,Stimulus 4,15,415.06223783128837 +Low,Stimulus 1,16,-178.77847300651229 +Med,Stimulus 1,16,216.2112655281391 +High,Stimulus 1,16,521.2549633131663 +Low,Stimulus 2,16,555.3270238856892 +Med,Stimulus 2,16,728.6605906393722 +High,Stimulus 2,16,1161.7866241904717 +Low,Stimulus 3,16,824.9378840354971 +Med,Stimulus 3,16,281.6961760061633 +High,Stimulus 3,16,654.1053635944277 +Low,Stimulus 4,16,366.89240249120405 +Med,Stimulus 4,16,274.00148959318363 +High,Stimulus 4,16,-138.13472665994874 +Low,Stimulus 1,17,64.45651067597458 +Med,Stimulus 1,17,372.49596031932424 +High,Stimulus 
1,17,956.5573432857968 +Low,Stimulus 2,17,584.2334077679479 +Med,Stimulus 2,17,624.9108461825859 +High,Stimulus 2,17,168.81592136839583 +Low,Stimulus 3,17,474.7488483821088 +Med,Stimulus 3,17,-42.984696467849915 +High,Stimulus 3,17,579.0440685971665 +Low,Stimulus 4,17,651.508155877656 +Med,Stimulus 4,17,481.8931295384157 +High,Stimulus 4,17,652.8970432164011 +Low,Stimulus 1,18,323.040841513249 +Med,Stimulus 1,18,453.4554233588045 +High,Stimulus 1,18,827.5090335046684 +Low,Stimulus 2,18,803.2534916411635 +Med,Stimulus 2,18,434.8375968787409 +High,Stimulus 2,18,135.3771843857557 +Low,Stimulus 3,18,556.7091510583898 +Med,Stimulus 3,18,310.69620562465207 +High,Stimulus 3,18,953.3467100976197 +Low,Stimulus 4,18,35.85324734530195 +Med,Stimulus 4,18,306.49704726367145 +High,Stimulus 4,18,558.0137903795882 +Low,Stimulus 1,19,203.59125594789242 +Med,Stimulus 1,19,101.47700923671476 +High,Stimulus 1,19,167.55306809578792 +Low,Stimulus 2,19,628.1281952034205 +Med,Stimulus 2,19,722.3995775563969 +High,Stimulus 2,19,638.7554954378697 +Low,Stimulus 3,19,1293.1192395731096 +Med,Stimulus 3,19,826.6247476967856 +High,Stimulus 3,19,514.2928864537262 +Low,Stimulus 4,19,242.73964861484336 +Med,Stimulus 4,19,1607.511686914863 +High,Stimulus 4,19,644.7447534999262 +Low,Stimulus 1,20,85.40986928006947 +Med,Stimulus 1,20,289.24805963027217 +High,Stimulus 1,20,163.04120476047092 +Low,Stimulus 2,20,345.3053812026559 +Med,Stimulus 2,20,176.32042335116756 +High,Stimulus 2,20,460.54510554304136 +Low,Stimulus 3,20,524.4151488800563 +Med,Stimulus 3,20,143.53685611972338 +High,Stimulus 3,20,-168.62782215042563 +Low,Stimulus 4,20,350.8762033920075 +Med,Stimulus 4,20,433.8867159794919 +High,Stimulus 4,20,615.9959397025937 diff --git a/website/data/sleep.csv b/website/data/sleep.csv new file mode 100644 index 0000000..f6b3bf8 --- /dev/null +++ b/website/data/sleep.csv @@ -0,0 +1,242 @@ +uniqueid,Start time,Completion time,My sleep is affected by my study commitments,I achieve good quality sleep,My 
electronic device usage negatively affects my sleep,Tiredness interferes with my concentration,"My sleep is disturbed by external factors e.g. loud cars, housemates, lights, children...",I often achieve eight hours of sleep,I regularly stay up past 11pm +60120734ec,2020-11-27T14:21:20Z,2020-11-27T14:22:47Z,Agree,Agree,Disagree,Strongly agree,Strongly agree,Disagree,Disagree +d0a92dec4b,2020-11-26T11:15:36Z,2020-11-26T11:16:37Z,Somewhat agree,Agree,Strongly agree,Somewhat agree,Disagree,Somewhat agree,Disagree +10eb3ee84d,2020-11-27T09:14:42Z,2020-11-27T09:18:33Z,Somewhat disagree,Somewhat agree,Somewhat disagree,Somewhat agree,Somewhat disagree,Somewhat agree,Somewhat agree +382eb4881a,2020-11-27T14:20:08Z,2020-11-27T14:22:10Z,Agree,Somewhat agree,Agree,Strongly agree,Somewhat agree,Neither agree nor disagree,Somewhat agree +371af06767,2020-11-27T14:24:28Z,2020-11-27T14:27:05Z,Somewhat agree,Somewhat disagree,Agree,Strongly agree,Agree,Disagree,Agree +3d112ce80d,2020-11-27T02:06:07Z,2020-11-27T02:06:53Z,Disagree,Strongly disagree,Somewhat agree,Disagree,Strongly disagree,Strongly disagree,Somewhat disagree +694d608e96,2020-11-27T11:11:28Z,2020-11-27T11:12:57Z,Neither agree nor disagree,Somewhat disagree,Agree,Strongly agree,Somewhat agree,Somewhat disagree,Strongly agree +4a40175756,2020-11-27T14:20:20Z,2020-11-27T14:20:58Z,Somewhat agree,Somewhat agree,Agree,Somewhat agree,Disagree,Somewhat agree,Strongly agree +de2cc7f322,2020-11-27T14:19:42Z,2020-11-27T14:22:50Z,Neither agree nor disagree,Agree,Neither agree nor disagree,Agree,Somewhat disagree,Strongly agree,Somewhat disagree +b3e6fe9098,2020-11-26T11:12:04Z,2020-11-26T11:14:49Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Agree,Somewhat agree,Strongly agree +3d66d6122d,2020-11-26T15:22:21Z,2020-11-26T15:23:19Z,Strongly agree,Disagree,Strongly agree,Strongly agree,Strongly agree,Disagree,Strongly agree +89e475a1e2,2020-11-26T11:19:57Z,2020-11-26T11:23:06Z,Strongly agree,Somewhat agree,Disagree,Strongly 
agree,Disagree,Strongly disagree,Strongly agree +b43f2f9149,2020-11-26T11:14:14Z,2020-11-26T11:15:43Z,Somewhat disagree,Neither agree nor disagree,Agree,Agree,Somewhat agree,Agree,Somewhat agree +ee6a502316,2020-11-26T11:20:26Z,2020-11-26T11:21:10Z,Somewhat agree,Disagree,Agree,Agree,Somewhat agree,Disagree,Strongly agree +cc8082ca43,2020-11-26T14:13:29Z,2020-11-26T14:15:01Z,Somewhat agree,Agree,Neither agree nor disagree,Neither agree nor disagree,Somewhat agree,Disagree,Agree +5a75e35a9e,2020-11-26T11:07:38Z,2020-11-26T11:10:50Z,Strongly agree,Somewhat disagree,Agree,Strongly agree,Strongly agree,Somewhat agree,Strongly agree +1497ffd89f,2020-11-27T11:13:17Z,2020-11-27T11:14:47Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Disagree,Somewhat agree,Strongly agree +abe35ed396,2020-11-26T11:17:10Z,2020-11-26T11:17:52Z,Agree,Disagree,Strongly agree,Agree,Somewhat disagree,Strongly agree,Strongly agree +880f9af842,2020-11-26T09:20:30Z,2020-11-26T09:21:16Z,Disagree,Strongly disagree,Somewhat disagree,Strongly agree,Somewhat agree,Strongly disagree,Strongly agree +01ab2f96e7,2020-11-27T14:13:54Z,2020-11-27T14:16:41Z,Agree,Disagree,Somewhat agree,Strongly agree,Agree,Strongly disagree,Strongly agree +a958509686,2020-11-27T09:18:04Z,2020-11-27T09:19:15Z,Somewhat agree,Somewhat agree,Somewhat agree,Strongly agree,Somewhat agree,Neither agree nor disagree,Agree +b6d4021303,2020-11-26T13:26:36Z,2020-11-26T13:28:07Z,Strongly disagree,Agree,Somewhat disagree,Strongly agree,Somewhat disagree,Strongly agree,Strongly agree +c61ef74d08,2020-11-27T11:12:13Z,2020-11-27T11:12:54Z,Somewhat agree,Disagree,Somewhat agree,Agree,Somewhat agree,Somewhat disagree,Agree +1f6128ed75,2020-11-26T11:09:59Z,2020-11-26T11:10:47Z,Somewhat agree,Somewhat agree,Somewhat agree,Agree,Agree,Agree,Somewhat disagree +fe8954f04f,2020-11-27T09:15:03Z,2020-11-27T09:15:57Z,Neither agree nor disagree,Agree,Disagree,Somewhat agree,Strongly disagree,Strongly agree,Somewhat disagree 
+b5674bb470,2020-11-26T09:11:53Z,2020-11-26T09:12:46Z,Strongly disagree,Agree,Disagree,Disagree,Disagree,Agree,Agree +ec4c85788b,2020-11-27T11:17:52Z,2020-11-27T11:19:57Z,Neither agree nor disagree,Disagree,Strongly agree,Agree,Neither agree nor disagree,Strongly disagree,Strongly agree +ffb0b16cbe,2020-11-27T09:51:32Z,2020-11-27T09:53:17Z,Agree,Somewhat disagree,Agree,Somewhat agree,Disagree,Strongly disagree,Strongly agree +5a8f333321,2020-11-27T14:16:01Z,2020-11-27T14:18:23Z,Somewhat agree,Somewhat agree,Strongly agree,Strongly agree,Somewhat disagree,Somewhat agree,Strongly agree +72f7717b51,2020-11-27T09:13:54Z,2020-11-27T09:19:26Z,Agree,Neither agree nor disagree,Somewhat agree,Strongly agree,Somewhat disagree,Agree,Agree +83b7a9c75e,2020-11-27T14:14:23Z,2020-11-27T14:14:52Z,Strongly disagree,Agree,Somewhat agree,Neither agree nor disagree,Strongly disagree,Agree,Somewhat agree +7aa074c54a,2020-11-26T11:18:17Z,2020-11-26T11:19:23Z,Somewhat agree,Disagree,Somewhat agree,Strongly agree,Somewhat agree,Somewhat agree,Somewhat agree +0ceae554d5,2020-11-26T14:12:38Z,2020-11-26T14:14:07Z,Somewhat agree,Disagree,Somewhat agree,Agree,Agree,Strongly disagree,Strongly disagree +22ab7f269c,2020-11-27T14:14:31Z,2020-11-27T14:17:39Z,Agree,Somewhat disagree,Strongly agree,Agree,Somewhat agree,Neither agree nor disagree,Strongly agree +41fbce2132,2020-11-27T11:23:36Z,2020-11-27T11:25:01Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Agree,Disagree,Disagree +31be5d26b8,2020-11-26T14:34:53Z,2020-11-26T14:35:33Z,Strongly agree,Somewhat disagree,Strongly disagree,Neither agree nor disagree,Strongly agree,Strongly disagree,Strongly agree +d0d7536654,2020-11-26T09:16:55Z,2020-11-26T09:17:47Z,Agree,Strongly disagree,Somewhat agree,Agree,Agree,Disagree,Agree +5a4413f820,2020-11-27T09:29:36Z,2020-11-27T09:30:54Z,Agree,Somewhat agree,Neither agree nor disagree,Somewhat agree,Disagree,Agree,Strongly agree +3e34bfcf47,2020-11-27T09:11:33Z,2020-11-27T09:14:15Z,Somewhat 
agree,Somewhat agree,Strongly agree,Agree,Somewhat disagree,Somewhat agree,Somewhat agree +4b692f18cb,2020-11-30T08:54:50Z,2020-11-30T08:55:46Z,Somewhat agree,Somewhat agree,Disagree,Somewhat agree,Disagree,Neither agree nor disagree,Somewhat agree +779cabdb52,2020-11-27T09:26:48Z,2020-11-27T09:27:29Z,Somewhat agree,Agree,Disagree,Agree,Strongly disagree,Somewhat disagree,Strongly agree +38797bc61b,2020-11-26T11:10:30Z,2020-11-26T11:11:21Z,Agree,Somewhat disagree,Agree,Somewhat disagree,Agree,Somewhat disagree,Strongly agree +1ac4dfd50a,2020-11-27T11:18:24Z,2020-11-27T11:20:12Z,Somewhat disagree,Agree,Somewhat disagree,Agree,Disagree,Agree,Agree +ae87bc5f9b,2020-11-27T09:15:34Z,2020-11-27T09:17:32Z,Agree,Somewhat disagree,Agree,Agree,Agree,Somewhat disagree,Agree +5ce22a3cc6,2020-11-27T09:27:30Z,2020-11-27T09:28:10Z,Agree,Disagree,Strongly agree,Strongly agree,Disagree,Disagree,Strongly agree +47e7fbe2a2,2020-11-27T14:11:19Z,2020-11-27T14:11:57Z,Somewhat agree,Agree,Somewhat agree,Agree,Disagree,Agree,Agree +5b455d550e,2020-11-27T09:23:03Z,2020-11-27T09:24:21Z,Agree,Agree,Somewhat agree,Strongly agree,Agree,Somewhat disagree,Disagree +c985e461b8,2020-11-26T11:46:43Z,2020-11-26T11:47:21Z,Somewhat agree,Agree,Strongly agree,Strongly agree,Somewhat disagree,Disagree,Strongly agree +389739a6a5,2020-11-27T09:15:58Z,2020-11-27T09:16:57Z,Somewhat agree,Somewhat agree,Agree,Agree,Somewhat disagree,Somewhat agree,Somewhat agree +00b905802d,2020-11-26T14:16:18Z,2020-11-26T14:18:37Z,Strongly agree,Somewhat disagree,Agree,Strongly agree,Strongly agree,Disagree,Strongly agree +c00b02da26,2020-11-26T14:09:22Z,2020-11-26T14:10:09Z,Strongly agree,Disagree,Somewhat agree,Strongly agree,Somewhat disagree,Strongly disagree,Agree +6d2d3b0a1a,2020-11-27T11:13:09Z,2020-11-27T11:14:44Z,Somewhat agree,Somewhat disagree,Neither agree nor disagree,Somewhat agree,Somewhat agree,Somewhat agree,Somewhat disagree +aac1327383,2020-11-27T14:10:58Z,2020-11-27T14:13:40Z,Strongly agree,Strongly 
disagree,Somewhat disagree,Strongly agree,Strongly agree,Strongly disagree,Strongly agree +da9eff6594,2020-11-27T09:13:24Z,2020-11-27T09:14:15Z,Strongly agree,Somewhat agree,Agree,Strongly agree,Somewhat agree,Neither agree nor disagree,Strongly agree +c51ca10f04,2020-11-26T09:12:52Z,2020-11-26T09:13:32Z,Strongly agree,Somewhat disagree,Agree,Agree,Strongly agree,Agree,Agree +346259afdc,2020-11-27T11:16:52Z,2020-11-27T11:18:26Z,Agree,Disagree,Neither agree nor disagree,Agree,Somewhat agree,Agree,Strongly agree +8455e99d5e,2020-11-27T11:14:40Z,2020-11-27T11:16:06Z,Somewhat agree,Somewhat agree,Disagree,Agree,Somewhat agree,Somewhat agree,Strongly agree +8ed53e632f,2020-11-26T09:18:04Z,2020-11-26T09:19:53Z,Neither agree nor disagree,Somewhat disagree,Somewhat disagree,Somewhat agree,Somewhat disagree,Somewhat disagree,Somewhat disagree +660998aef1,2020-11-27T09:25:07Z,2020-11-27T09:28:30Z,Somewhat agree,Somewhat agree,Agree,Agree,Somewhat agree,Somewhat agree,Strongly agree +19652d1ba0,2020-11-26T10:20:30Z,2020-11-26T10:23:35Z,Neither agree nor disagree,Disagree,Strongly agree,Agree,Disagree,Strongly disagree,Strongly disagree +8560dbdc32,2020-11-26T09:22:37Z,2020-11-26T09:24:18Z,Strongly disagree,Somewhat agree,Somewhat disagree,Agree,Somewhat disagree,Disagree,Strongly agree +c51881d0ab,2020-11-27T14:17:29Z,2020-11-27T14:18:39Z,Strongly agree,Strongly disagree,Somewhat agree,Agree,Somewhat agree,Strongly disagree,Strongly agree +dbb832691b,2020-11-27T11:11:04Z,2020-11-27T11:13:16Z,Strongly disagree,Strongly agree,Disagree,Somewhat agree,Somewhat disagree,Strongly agree,Agree +ccba950bd7,2020-11-26T14:18:07Z,2020-11-26T14:19:49Z,Strongly agree,Somewhat disagree,Neither agree nor disagree,Agree,Disagree,Strongly disagree,Neither agree nor disagree +a4614d86ae,2020-11-26T11:10:37Z,2020-11-26T11:11:21Z,Somewhat agree,Somewhat agree,Neither agree nor disagree,Strongly agree,Agree,Somewhat disagree,Strongly agree 
+b928cfeeb0,2020-11-26T14:21:21Z,2020-11-26T14:22:49Z,Agree,Somewhat agree,Agree,Agree,Disagree,Somewhat agree,Strongly agree +b1f61d8295,2020-11-27T09:21:10Z,2020-11-27T09:22:13Z,Somewhat agree,Neither agree nor disagree,Somewhat agree,Agree,Neither agree nor disagree,Somewhat disagree,Neither agree nor disagree +5617e7dd7b,2020-11-27T09:20:52Z,2020-11-27T09:21:53Z,Agree,Strongly disagree,Agree,Strongly agree,Somewhat agree,Somewhat agree,Strongly agree +28f2b28f2f,2020-11-27T09:16:07Z,2020-11-27T09:17:20Z,Somewhat agree,Strongly disagree,Strongly agree,Strongly agree,Somewhat agree,Disagree,Strongly agree +d857a1a885,2020-11-26T09:22:41Z,2020-11-26T09:24:13Z,Somewhat agree,Neither agree nor disagree,Strongly agree,Strongly agree,Neither agree nor disagree,Somewhat agree,Agree +f00c5ca58c,2020-11-26T10:25:40Z,2020-11-26T10:26:14Z,Somewhat agree,Somewhat disagree,Neither agree nor disagree,Somewhat agree,Somewhat agree,Disagree,Somewhat agree +2722e61e84,2020-11-26T14:15:43Z,2020-11-26T14:19:17Z,Somewhat agree,Disagree,Neither agree nor disagree,Agree,Neither agree nor disagree,Neither agree nor disagree,Agree +4f849bdcaa,2020-11-26T11:13:53Z,2020-11-26T11:14:57Z,Strongly disagree,Strongly agree,Strongly disagree,Strongly agree,Disagree,Strongly agree,Strongly agree +809b462b55,2020-11-26T14:14:40Z,2020-11-26T14:15:44Z,Disagree,Agree,Neither agree nor disagree,Strongly disagree,Strongly disagree,Disagree,Disagree +204ba22692,2020-11-26T11:08:44Z,2020-11-26T11:09:39Z,Agree,Disagree,Agree,Agree,Disagree,Somewhat agree,Strongly agree +e1493f47f7,2020-11-26T11:16:34Z,2020-11-26T11:18:01Z,Agree,Somewhat agree,Strongly agree,Strongly agree,Strongly agree,Somewhat agree,Disagree +50a3afb7c1,2020-11-27T09:14:16Z,2020-11-27T09:15:06Z,Somewhat agree,Agree,Agree,Strongly agree,Agree,Somewhat agree,Somewhat agree +6508b073f4,2020-11-26T11:15:10Z,2020-11-26T11:15:55Z,Disagree,Agree,Disagree,Somewhat disagree,Strongly disagree,Agree,Somewhat agree 
+4c318c100e,2020-11-26T14:59:43Z,2020-11-26T15:00:36Z,Strongly agree,Strongly disagree,Neither agree nor disagree,Strongly agree,Disagree,Strongly disagree,Somewhat agree +762f30ad94,2020-11-27T15:24:35Z,2020-11-27T15:25:34Z,Strongly agree,Disagree,Agree,Agree,Somewhat agree,Disagree,Strongly agree +173cc17873,2020-11-27T14:15:35Z,2020-11-27T14:16:11Z,Agree,Disagree,Agree,Strongly agree,Agree,Disagree,Strongly agree +d12d79c755,2020-11-29T22:26:19Z,2020-11-29T22:27:09Z,Strongly agree,Disagree,Somewhat disagree,Somewhat agree,Agree,Strongly disagree,Strongly agree +e828daa59d,2020-11-27T14:10:38Z,2020-11-27T14:11:39Z,Strongly agree,Neither agree nor disagree,Somewhat agree,Strongly agree,Somewhat agree,Neither agree nor disagree,Strongly agree +a02e551405,2020-11-27T11:12:55Z,2020-11-27T11:13:47Z,Somewhat agree,Agree,Somewhat agree,Neither agree nor disagree,Disagree,Agree,Agree +b2af77796b,2020-11-26T11:11:05Z,2020-11-26T11:11:40Z,Somewhat disagree,Agree,Somewhat disagree,Somewhat agree,Somewhat disagree,Agree,Agree +506dd240a9,2020-11-27T11:15:11Z,2020-11-27T11:15:52Z,Disagree,Somewhat disagree,Disagree,Agree,Strongly disagree,Disagree,Somewhat agree +61e59ca80b,2020-11-27T09:33:50Z,2020-11-27T09:35:46Z,Somewhat agree,Agree,Somewhat agree,Agree,Disagree,Strongly agree,Strongly agree +e57529cd7c,2020-11-27T11:11:38Z,2020-11-27T11:12:25Z,Agree,Somewhat agree,Somewhat agree,Agree,Strongly agree,Agree,Strongly agree +6292443105,2020-11-26T11:30:19Z,2020-11-26T11:31:48Z,Somewhat disagree,Agree,Somewhat agree,Agree,Disagree,Strongly agree,Disagree +ec952be578,2020-11-27T11:27:14Z,2020-11-27T11:27:47Z,Agree,Somewhat agree,Somewhat agree,Strongly agree,Somewhat disagree,Agree,Agree +b835621afe,2020-11-26T14:23:02Z,2020-11-26T14:25:02Z,Agree,Somewhat agree,Agree,Strongly agree,Somewhat agree,Agree,Strongly agree +3c77acee06,2020-11-26T11:21:46Z,2020-11-26T11:24:43Z,Somewhat agree,Somewhat disagree,Agree,Strongly agree,Somewhat agree,Somewhat agree,Agree 
+92ef237278,2020-11-26T14:14:30Z,2020-11-26T14:15:52Z,Somewhat agree,Agree,Somewhat disagree,Somewhat agree,Disagree,Agree,Disagree +b990042619,2020-11-26T09:18:15Z,2020-11-26T09:19:39Z,Somewhat agree,Strongly disagree,Agree,Agree,Disagree,Strongly disagree,Strongly agree +455e86c920,2020-11-27T14:13:37Z,2020-11-27T14:14:09Z,Somewhat disagree,Somewhat agree,Strongly agree,Strongly agree,Strongly agree,Neither agree nor disagree,Strongly agree +c0c288d8a3,2020-11-26T14:14:53Z,2020-11-26T14:15:46Z,Neither agree nor disagree,Agree,Neither agree nor disagree,Somewhat agree,Neither agree nor disagree,Agree,Somewhat agree +455c8db493,2020-11-27T11:11:47Z,2020-11-27T11:13:46Z,Somewhat agree,Neither agree nor disagree,Somewhat agree,Agree,Agree,Disagree,Strongly disagree +879d9de58d,2020-11-26T11:14:49Z,2020-11-26T11:17:21Z,Somewhat agree,Somewhat agree,Somewhat agree,Neither agree nor disagree,Somewhat disagree,Strongly agree,Somewhat agree +25e3cebb9e,2020-11-26T11:11:23Z,2020-11-26T11:12:49Z,Agree,Disagree,Strongly disagree,Strongly agree,Disagree,Disagree,Strongly agree +bbb836a4a8,2020-11-26T14:22:21Z,2020-11-26T14:23:33Z,Agree,Disagree,Disagree,Strongly agree,Somewhat agree,Disagree,Strongly agree +1654823600,2020-11-26T09:24:30Z,2020-11-26T09:25:17Z,Somewhat agree,Somewhat agree,Strongly agree,Agree,Disagree,Agree,Strongly agree +80cff3a78b,2020-11-26T11:02:26Z,2020-11-26T11:03:23Z,Somewhat agree,Somewhat disagree,Agree,Agree,Somewhat agree,Disagree,Strongly agree +d7f029014a,2020-11-26T14:14:18Z,2020-11-26T14:14:59Z,Somewhat agree,Strongly disagree,Disagree,Strongly agree,Agree,Strongly disagree,Agree +38a8142899,2020-11-27T09:14:36Z,2020-11-27T09:15:53Z,Somewhat agree,Neither agree nor disagree,Somewhat agree,Agree,Agree,Disagree,Agree +175f72c3ee,2020-11-27T12:34:10Z,2020-11-27T12:34:43Z,Strongly agree,Somewhat disagree,Agree,Strongly agree,Somewhat disagree,Somewhat agree,Strongly agree +82b9a31f8b,2020-11-26T09:31:12Z,2020-11-26T10:23:12Z,Neither agree nor 
disagree,Somewhat agree,Somewhat disagree,Agree,Agree,Neither agree nor disagree,Strongly agree +87cbc06116,2020-11-27T09:22:09Z,2020-11-27T09:23:56Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Neither agree nor disagree,Neither agree nor disagree,Strongly agree +de4a8fcf2a,2020-11-27T11:24:42Z,2020-11-27T11:25:14Z,Strongly agree,Disagree,Agree,Agree,Strongly disagree,Strongly disagree,Strongly agree +f4c1673354,2020-11-26T09:15:39Z,2020-11-26T09:17:16Z,Somewhat agree,Agree,Agree,Agree,Somewhat agree,Somewhat agree,Strongly agree +fc63dcf06d,2020-11-26T14:19:42Z,2020-11-26T14:21:25Z,Strongly agree,Disagree,Somewhat agree,Strongly agree,Strongly agree,Strongly disagree,Neither agree nor disagree +1e54183579,2020-11-27T11:14:09Z,2020-11-27T11:15:24Z,Somewhat agree,Neither agree nor disagree,Agree,Agree,Somewhat disagree,Agree,Strongly agree +8ae9190823,2020-11-26T09:18:41Z,2020-11-26T09:19:36Z,Somewhat agree,Somewhat disagree,Strongly agree,Agree,Agree,Somewhat agree,Strongly agree +243c012b3f,2020-11-26T09:11:12Z,2020-11-26T09:15:00Z,Somewhat agree,Somewhat agree,Neither agree nor disagree,Strongly agree,Agree,Strongly agree,Agree +6574f6fdd2,2020-11-26T14:15:17Z,2020-11-26T14:16:52Z,Somewhat agree,Agree,Disagree,Agree,Somewhat agree,Somewhat agree,Somewhat agree +5d4bcd90f1,2020-11-27T11:18:33Z,2020-11-27T11:21:11Z,Somewhat agree,Somewhat agree,Agree,Strongly agree,Disagree,Agree,Disagree +13416e1140,2020-11-26T11:12:40Z,2020-11-26T11:13:47Z,Somewhat agree,Neither agree nor disagree,Somewhat agree,Agree,Neither agree nor disagree,Neither agree nor disagree,Strongly agree +82f99a6b8e,2020-11-26T09:16:31Z,2020-11-26T09:18:05Z,Strongly agree,Strongly agree,Disagree,Somewhat agree,Somewhat agree,Strongly agree,Disagree +222af9fd30,2020-11-27T14:13:04Z,2020-11-27T14:15:33Z,Agree,Disagree,Agree,Agree,Disagree,Agree,Neither agree nor disagree +3245b5a2b5,2020-11-26T09:30:14Z,2020-11-26T09:31:03Z,Disagree,Disagree,Somewhat agree,Agree,Neither agree nor 
disagree,Neither agree nor disagree,Agree +8a74b533e8,2020-11-26T09:15:32Z,2020-11-26T09:16:23Z,Somewhat agree,Somewhat agree,Somewhat agree,Neither agree nor disagree,Somewhat disagree,Neither agree nor disagree,Somewhat agree +3887ca7da6,2020-11-26T09:14:07Z,2020-11-26T09:15:05Z,Agree,Disagree,Agree,Strongly agree,Somewhat disagree,Disagree,Strongly agree +fd03ccfcc2,2020-11-26T09:15:06Z,2020-11-26T09:16:07Z,Disagree,Agree,Somewhat agree,Disagree,Strongly disagree,Agree,Strongly agree +8c51b6eb65,2020-11-27T09:19:38Z,2020-11-27T09:21:05Z,Strongly agree,Agree,Strongly agree,Strongly agree,Agree,Agree,Strongly agree +3f92133e4a,2020-11-27T09:18:21Z,2020-11-27T09:19:22Z,Agree,Strongly disagree,Agree,Agree,Agree,Disagree,Strongly agree +06e5c36a6a,2020-11-27T11:12:51Z,2020-11-27T11:14:16Z,Somewhat agree,Somewhat disagree,Agree,Somewhat agree,Somewhat disagree,Somewhat agree,Strongly agree +c6eb3ebb5c,2020-11-26T11:21:05Z,2020-11-26T11:23:18Z,Strongly agree,Agree,Neither agree nor disagree,Strongly agree,Disagree,Agree,Somewhat agree +9b479a21db,2020-11-26T09:14:04Z,2020-11-26T09:17:17Z,Somewhat agree,Somewhat agree,Disagree,Somewhat agree,Agree,Agree,Agree +6689a8f778,2020-11-26T14:23:09Z,2020-11-26T14:23:45Z,Agree,Strongly disagree,Somewhat agree,Somewhat agree,Somewhat agree,Strongly disagree,Strongly agree +426a13a83d,2020-11-27T11:12:48Z,2020-11-27T11:14:03Z,Somewhat agree,Strongly agree,Disagree,Strongly agree,Agree,Strongly agree,Disagree +c12f3f6d8e,2020-11-27T14:18:37Z,2020-11-27T14:19:38Z,Somewhat agree,Somewhat agree,Somewhat disagree,Agree,Somewhat agree,Agree,Somewhat agree +259a551661,2020-11-27T11:15:12Z,2020-11-27T11:16:08Z,Somewhat disagree,Agree,Somewhat agree,Strongly agree,Disagree,Agree,Strongly agree +a11c1edbff,2020-11-27T11:10:55Z,2020-11-27T11:12:33Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Disagree,Disagree,Strongly agree +5cf5ec229c,2020-11-27T09:12:57Z,2020-11-27T09:13:41Z,Agree,Disagree,Somewhat agree,Somewhat agree,Somewhat 
disagree,Strongly disagree,Strongly agree +75019cc3d6,2020-11-27T11:18:28Z,2020-11-27T11:20:57Z,Somewhat agree,Strongly disagree,Somewhat agree,Somewhat agree,Somewhat agree,Disagree,Strongly agree +bd8c05d34d,2020-11-26T08:14:43Z,2020-11-26T08:15:30Z,Strongly agree,Disagree,Disagree,Strongly agree,Strongly agree,Strongly disagree,Agree +cfd173037f,2020-11-27T09:12:55Z,2020-11-27T09:13:26Z,Somewhat agree,Agree,Agree,Strongly agree,Neither agree nor disagree,Agree,Somewhat agree +2a055735c4,2020-11-27T11:17:14Z,2020-11-27T11:19:11Z,Strongly agree,Somewhat disagree,Somewhat agree,Agree,Agree,Somewhat agree,Somewhat agree +11813b71d3,2020-11-27T14:12:29Z,2020-11-27T14:13:01Z,Agree,Somewhat agree,Disagree,Agree,Strongly agree,Somewhat agree,Agree +120b3beeaf,2020-11-27T11:14:14Z,2020-11-27T11:15:09Z,Somewhat agree,Somewhat agree,Neither agree nor disagree,Somewhat agree,Agree,Somewhat agree,Agree +301417186c,2020-11-26T09:14:23Z,2020-11-26T09:15:30Z,Disagree,Somewhat agree,Disagree,Somewhat agree,Somewhat agree,Agree,Agree +e374dcb5c5,2020-11-27T11:12:02Z,2020-11-27T11:13:20Z,Agree,Neither agree nor disagree,Somewhat disagree,Somewhat disagree,Somewhat agree,Disagree,Somewhat agree +e21ac9f5a5,2020-11-27T11:14:42Z,2020-11-27T11:15:22Z,Somewhat agree,Strongly agree,Somewhat agree,Strongly agree,Somewhat disagree,Strongly agree,Agree +0e28c3c4be,2020-11-26T09:17:04Z,2020-11-26T09:18:54Z,Agree,Agree,Agree,Strongly agree,Somewhat agree,Agree,Strongly agree +d118347494,2020-11-27T09:03:36Z,2020-11-27T09:04:51Z,Disagree,Agree,Disagree,Agree,Neither agree nor disagree,Disagree,Strongly disagree +976f556a93,2020-11-27T11:10:39Z,2020-11-27T11:12:07Z,Somewhat agree,Somewhat disagree,Strongly agree,Strongly agree,Strongly disagree,Somewhat disagree,Agree +e61864df16,2020-11-26T14:47:14Z,2020-11-26T14:48:06Z,Disagree,Disagree,Somewhat agree,Agree,Agree,Strongly disagree,Strongly agree +6ee3759e4d,2020-11-27T10:39:24Z,2020-11-27T11:14:29Z,Strongly agree,Disagree,Somewhat 
agree,Strongly agree,Agree,Somewhat disagree,Agree +a0b72f8c48,2020-11-26T11:17:49Z,2020-11-26T11:19:01Z,Agree,Disagree,Agree,Strongly agree,Strongly agree,Disagree,Strongly agree +2b1d98277d,2020-11-26T14:33:53Z,2020-11-26T14:35:45Z,Strongly agree,Disagree,Neither agree nor disagree,Agree,Agree,Strongly disagree,Agree +cf21bcf3bd,2020-11-26T14:15:59Z,2020-11-26T14:19:41Z,Neither agree nor disagree,Agree,Somewhat disagree,Agree,Somewhat disagree,Agree,Somewhat disagree +736f0e7b95,2020-11-27T14:13:43Z,2020-11-27T14:16:14Z,Strongly agree,Somewhat disagree,Neither agree nor disagree,Strongly agree,Strongly disagree,Agree,Strongly disagree +f85f74800d,2020-11-26T11:22:04Z,2020-11-26T11:22:47Z,Agree,Somewhat agree,Neither agree nor disagree,Somewhat agree,Agree,Somewhat agree,Somewhat agree +26a5d2c885,2020-11-26T14:32:23Z,2020-11-26T14:42:37Z,Neither agree nor disagree,Somewhat agree,Somewhat disagree,Strongly agree,Somewhat agree,Agree,Strongly agree +22dd74cdc7,2020-11-27T14:13:15Z,2020-11-27T14:13:49Z,Somewhat agree,Agree,Disagree,Strongly agree,Strongly disagree,Strongly agree,Disagree +b55fcd599f,2020-11-26T09:17:54Z,2020-11-26T09:19:33Z,Somewhat agree,Somewhat agree,Strongly agree,Strongly agree,Somewhat agree,Somewhat disagree,Strongly agree +ee4132f47b,2020-11-27T14:25:17Z,2020-11-27T14:26:15Z,Somewhat disagree,Somewhat disagree,Neither agree nor disagree,Agree,Somewhat agree,Somewhat agree,Strongly agree +c4d840fb2a,2020-11-26T09:15:33Z,2020-11-26T09:16:34Z,Agree,Somewhat disagree,Somewhat agree,Agree,Agree,Somewhat agree,Strongly agree +1338277ca7,2020-11-27T11:25:36Z,2020-11-27T11:27:04Z,Strongly agree,Somewhat agree,Somewhat agree,Agree,Somewhat disagree,Somewhat agree,Strongly agree +711c2cf8a3,2020-11-27T09:20:06Z,2020-11-27T09:21:23Z,Somewhat agree,Disagree,Strongly agree,Agree,Disagree,Somewhat disagree,Strongly agree +ee21d54a71,2020-11-27T14:12:13Z,2020-11-27T14:15:35Z,Disagree,Somewhat agree,Agree,Agree,Neither agree nor disagree,Agree,Strongly 
agree +cff9d2970a,2020-11-26T09:35:44Z,2020-11-26T09:39:50Z,Agree,Somewhat disagree,Agree,Agree,Strongly disagree,Strongly disagree,Strongly agree +be897aa438,2020-11-26T14:15:00Z,2020-11-26T14:17:38Z,Somewhat agree,Neither agree nor disagree,Somewhat disagree,Agree,Disagree,Somewhat disagree,Disagree +922430c712,2020-11-26T14:08:18Z,2020-11-26T14:15:16Z,Agree,Somewhat agree,Neither agree nor disagree,Somewhat agree,Somewhat agree,Agree,Somewhat agree +74c6036c5a,2020-11-26T14:12:46Z,2020-11-26T14:14:23Z,Agree,Somewhat disagree,Agree,Strongly agree,Somewhat agree,Agree,Somewhat agree +6cf2a93dd9,2020-11-26T09:17:59Z,2020-11-26T09:19:45Z,Agree,Somewhat disagree,Strongly agree,Agree,Somewhat agree,Disagree,Agree +f8fbdddad4,2020-11-27T14:14:17Z,2020-11-27T14:16:05Z,Somewhat disagree,Disagree,Somewhat agree,Agree,Strongly agree,Somewhat disagree,Neither agree nor disagree +84bfe8b5b0,2020-11-26T14:09:03Z,2020-11-26T14:09:59Z,Strongly disagree,Strongly agree,Disagree,Disagree,Strongly disagree,Strongly agree,Neither agree nor disagree +b308d4636c,2020-11-26T09:11:38Z,2020-11-26T09:13:08Z,Strongly agree,Neither agree nor disagree,Somewhat agree,Agree,Somewhat disagree,Somewhat agree,Strongly agree +1529f83cc8,2020-11-25T17:40:38Z,2020-11-25T17:40:49Z,Somewhat disagree,Disagree,Somewhat disagree,Disagree,Somewhat disagree,Disagree,Somewhat disagree +5e00894bbd,2020-11-27T11:15:41Z,2020-11-27T11:16:55Z,Disagree,Somewhat agree,Disagree,Strongly agree,Strongly disagree,Strongly agree,Strongly agree +905a8d0bba,2020-11-27T14:13:51Z,2020-11-27T14:15:11Z,Agree,Disagree,Neither agree nor disagree,Agree,Agree,Somewhat disagree,Strongly agree +096c49cd8a,2020-11-26T11:17:41Z,2020-11-26T11:19:42Z,Agree,Agree,Strongly agree,Strongly agree,Neither agree nor disagree,Somewhat disagree,Agree +cab36a82c5,2020-11-27T14:14:12Z,2020-11-27T14:15:24Z,Somewhat agree,Agree,Somewhat agree,Strongly agree,Strongly disagree,Somewhat agree,Strongly agree 
+275d7f96dc,2020-11-26T11:13:57Z,2020-11-26T11:14:43Z,Somewhat agree,Somewhat agree,Strongly agree,Strongly agree,Disagree,Somewhat agree,Strongly agree +ff0240fc51,2020-11-26T11:09:59Z,2020-11-26T11:10:49Z,Somewhat agree,Agree,Disagree,Agree,Neither agree nor disagree,Somewhat agree,Somewhat disagree +25c54381cd,2020-11-26T11:12:40Z,2020-11-26T11:19:16Z,Strongly agree,Strongly disagree,Agree,Strongly agree,Strongly agree,Strongly disagree,Strongly agree +2ee5641475,2020-11-26T11:19:08Z,2020-11-26T11:20:00Z,Strongly agree,Somewhat disagree,Agree,Agree,Neither agree nor disagree,Disagree,Strongly agree +54b8255362,2020-11-26T09:19:17Z,2020-11-26T09:19:56Z,Somewhat disagree,Agree,Somewhat agree,Agree,Agree,Somewhat agree,Somewhat disagree +c6238f274a,2020-11-27T14:12:50Z,2020-11-27T14:13:50Z,Agree,Strongly disagree,Somewhat agree,Agree,Somewhat agree,Disagree,Strongly agree +0cbb730042,2020-11-26T09:13:26Z,2020-11-26T09:14:01Z,Agree,Disagree,Somewhat agree,Somewhat agree,Somewhat agree,Somewhat disagree,Agree +22cf67ed51,2020-11-26T14:52:13Z,2020-11-26T14:53:14Z,Somewhat agree,Somewhat agree,Agree,Somewhat disagree,Strongly disagree,Agree,Agree +74db260cc9,2020-11-26T09:14:41Z,2020-11-26T09:18:11Z,Somewhat agree,Agree,Agree,Somewhat agree,Agree,Agree,Agree +5cb0b1e1a0,2020-11-26T09:15:16Z,2020-11-26T09:19:28Z,Somewhat agree,Disagree,Somewhat agree,Somewhat agree,Neither agree nor disagree,Agree,Strongly agree +bec6f0addc,2020-11-30T09:07:18Z,2020-11-30T09:08:34Z,Agree,Somewhat agree,Disagree,Somewhat agree,Strongly disagree,Somewhat agree,Somewhat agree +8b291015f5,2020-11-27T09:10:16Z,2020-11-27T09:15:34Z,Somewhat agree,Disagree,Somewhat agree,Strongly agree,Agree,Somewhat agree,Strongly agree +d6e8636688,2020-11-26T14:26:06Z,2020-11-26T14:27:21Z,Somewhat disagree,Somewhat agree,Disagree,Agree,Strongly disagree,Neither agree nor disagree,Somewhat disagree +679df0ff1d,2020-11-27T11:13:35Z,2020-11-27T11:14:26Z,Neither agree nor disagree,Agree,Somewhat 
agree,Agree,Strongly disagree,Strongly agree,Strongly agree +3cfcf99d67,2020-11-27T09:21:46Z,2020-11-27T09:22:43Z,Strongly agree,Neither agree nor disagree,Disagree,Agree,Strongly agree,Somewhat agree,Neither agree nor disagree +3f5a709ea5,2020-11-26T09:15:22Z,2020-11-26T09:17:06Z,Agree,Somewhat disagree,Disagree,Agree,Somewhat disagree,Disagree,Agree +080f29565b,2020-11-26T14:18:44Z,2020-11-26T14:20:14Z,Agree,Agree,Somewhat agree,Strongly agree,Somewhat agree,Somewhat disagree,Neither agree nor disagree +2a7f43bbba,2020-11-26T09:30:35Z,2020-11-26T09:32:47Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Somewhat agree,Somewhat disagree,Strongly agree +6b1631d235,2020-11-26T14:37:30Z,2020-11-26T14:38:59Z,Agree,Somewhat disagree,Somewhat agree,Strongly agree,Agree,Somewhat agree,Strongly agree +9b4369a809,2020-11-27T09:16:00Z,2020-11-27T09:17:30Z,Neither agree nor disagree,Neither agree nor disagree,Disagree,Agree,Strongly agree,Somewhat agree,Somewhat agree +1f038fbe3e,2020-11-26T14:16:12Z,2020-11-26T14:17:26Z,Strongly agree,Strongly disagree,Agree,Strongly agree,Neither agree nor disagree,Disagree,Agree +03a21dab04,2020-11-26T11:11:39Z,2020-11-26T11:12:20Z,Strongly disagree,Somewhat agree,Disagree,Agree,Somewhat disagree,Disagree,Somewhat agree +5d7b4887c1,2020-11-26T14:14:32Z,2020-11-26T14:16:13Z,Agree,Somewhat disagree,Somewhat agree,Strongly agree,Agree,Strongly disagree,Strongly agree +1e6b22e63f,2020-11-26T14:22:48Z,2020-11-26T14:27:00Z,Strongly agree,Strongly disagree,Strongly disagree,Strongly agree,Disagree,Strongly disagree,Strongly agree +4280f720ee,2020-11-27T09:16:17Z,2020-11-27T09:19:41Z,Agree,Disagree,Somewhat disagree,Agree,Somewhat agree,Strongly disagree,Strongly agree +5d024bf3cc,2020-11-27T11:18:22Z,2020-11-27T11:22:57Z,Somewhat agree,Disagree,Disagree,Agree,Somewhat agree,Strongly agree,Strongly agree +d32b1cede5,2020-11-27T14:18:46Z,2020-11-27T14:19:53Z,Agree,Somewhat agree,Strongly disagree,Agree,Somewhat agree,Strongly 
disagree,Somewhat agree +7bd9665da0,2020-11-26T14:16:00Z,2020-11-26T14:17:57Z,Strongly agree,Disagree,Disagree,Agree,Strongly disagree,Disagree,Neither agree nor disagree +11c1e5e763,2020-11-25T17:43:10Z,2020-11-25T17:43:22Z,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree +11c1e5e763,2020-11-25T17:44:18Z,2020-11-25T17:44:26Z,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree +2f5c35e0de,2020-11-26T14:19:12Z,2020-11-26T14:21:15Z,Somewhat agree,Agree,Somewhat agree,Strongly agree,Somewhat agree,Agree,Strongly disagree +93810528a7,2020-11-26T14:17:03Z,2020-11-26T14:19:28Z,Disagree,Strongly agree,Somewhat agree,Agree,Agree,Somewhat agree,Disagree +c04322287e,2020-11-26T11:21:52Z,2020-11-26T11:28:12Z,Agree,Somewhat agree,Somewhat agree,Strongly agree,Agree,Neither agree nor disagree,Strongly agree +48aec8a86c,2020-11-27T09:17:55Z,2020-11-27T09:18:47Z,Strongly agree,Strongly disagree,Agree,Agree,Somewhat disagree,Strongly disagree,Strongly agree +9995d64e32,2020-11-26T14:15:33Z,2020-11-26T14:16:48Z,Somewhat disagree,Strongly agree,Neither agree nor disagree,Agree,Disagree,Agree,Strongly agree +e75d626948,2020-11-27T09:15:05Z,2020-11-27T09:16:25Z,Somewhat agree,Agree,Agree,Somewhat disagree,Agree,Agree,Agree +db59d4586a,2020-11-26T09:20:25Z,2020-11-26T09:21:24Z,Somewhat agree,Somewhat disagree,Agree,Strongly agree,Agree,Somewhat disagree,Somewhat disagree +f73160e99c,2020-11-26T03:05:53Z,2020-11-26T03:06:58Z,Somewhat disagree,Somewhat agree,Agree,Agree,Somewhat disagree,Neither agree nor disagree,Strongly agree +7fab0e3529,2020-11-26T14:13:54Z,2020-11-26T14:14:53Z,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Somewhat disagree,Somewhat agree +7fab0e3529,2020-11-26T14:14:13Z,2020-11-26T14:15:12Z,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly 
disagree,Somewhat disagree,Somewhat agree +4f8da97019,2020-11-26T14:18:27Z,2020-11-26T14:20:27Z,Agree,Somewhat agree,Neither agree nor disagree,Neither agree nor disagree,Neither agree nor disagree,Somewhat disagree,Somewhat agree +ffe87965a8,2020-11-27T11:16:29Z,2020-11-27T11:17:33Z,Agree,Somewhat agree,Somewhat disagree,Agree,Agree,Neither agree nor disagree,Agree +7e3abf5784,2020-11-27T14:10:29Z,2020-11-27T14:17:28Z,Agree,Strongly disagree,Strongly agree,Agree,Agree,Strongly disagree,Strongly agree +5429d7c472,2020-11-27T11:19:22Z,2020-11-27T11:20:42Z,Somewhat agree,Somewhat agree,Somewhat agree,Strongly agree,Somewhat agree,Somewhat agree,Somewhat agree +be71d29fd7,2020-11-26T09:35:40Z,2020-11-26T09:36:39Z,Agree,Somewhat disagree,Somewhat agree,Agree,Agree,Disagree,Somewhat agree +b4dbf244b7,2020-11-27T14:11:00Z,2020-11-27T14:11:45Z,Agree,Somewhat disagree,Agree,Agree,Strongly agree,Disagree,Strongly agree +873236ee4e,2020-11-26T09:12:56Z,2020-11-26T09:13:36Z,Somewhat agree,Somewhat agree,Somewhat agree,Somewhat agree,Neither agree nor disagree,Agree,Agree +ed945ff1d6,2020-11-27T09:12:12Z,2020-11-27T09:13:03Z,Somewhat agree,Agree,Disagree,Strongly agree,Disagree,Agree,Strongly disagree +d92e539706,2020-11-26T09:14:19Z,2020-11-26T09:17:16Z,Disagree,Somewhat agree,Somewhat agree,Somewhat agree,Somewhat agree,Agree,Strongly agree +6039862e1d,2020-11-26T09:40:07Z,2020-11-26T09:41:39Z,Agree,Disagree,Agree,Somewhat agree,Somewhat agree,Disagree,Strongly agree +faf4e76cd7,2020-11-27T14:14:45Z,2020-11-27T14:15:40Z,Somewhat agree,Somewhat disagree,Somewhat agree,Agree,Disagree,Somewhat agree,Somewhat disagree +5146a2a9c3,2020-11-27T14:17:30Z,2020-11-27T14:18:33Z,Neither agree nor disagree,Somewhat agree,Agree,Agree,Somewhat disagree,Disagree,Agree +08355b9d0a,2020-11-27T09:16:08Z,2020-11-27T09:17:05Z,Agree,Somewhat agree,Agree,Strongly agree,Strongly agree,Agree,Strongly agree +5b48721f8a,2020-11-26T11:26:37Z,2020-11-26T11:27:24Z,Somewhat agree,Somewhat 
disagree,Somewhat agree,Agree,Strongly agree,Somewhat disagree,Agree +0aa00897ec,2020-11-27T09:22:39Z,2020-11-27T09:27:09Z,Strongly disagree,Strongly agree,Disagree,Somewhat agree,Disagree,Strongly agree,Strongly agree +c61c6925cb,2020-11-26T09:19:34Z,2020-11-26T09:20:08Z,Strongly agree,Neither agree nor disagree,Neither agree nor disagree,Strongly agree,Disagree,Somewhat disagree,Strongly agree +db6c9a11cc,2020-11-27T11:17:17Z,2020-11-27T11:18:59Z,Somewhat agree,Agree,Agree,Agree,Agree,Somewhat agree,Agree +7bc47305d9,2020-11-27T11:12:04Z,2020-11-27T11:13:33Z,Neither agree nor disagree,Somewhat agree,Somewhat agree,Agree,Disagree,Somewhat agree,Strongly agree +cda5e493b6,2020-11-26T09:13:14Z,2020-11-26T09:14:09Z,Somewhat agree,Somewhat disagree,Somewhat agree,Strongly agree,Somewhat disagree,Disagree,Strongly agree +bebede31aa,2020-11-27T14:13:21Z,2020-11-27T14:14:01Z,Somewhat agree,Agree,Strongly disagree,Somewhat disagree,Disagree,Somewhat disagree,Strongly agree +c457cdc907,2020-11-26T11:11:23Z,2020-11-26T11:11:57Z,Agree,Somewhat disagree,Agree,Agree,Somewhat disagree,Somewhat disagree,Strongly agree +1c504da715,2020-11-27T09:16:48Z,2020-11-27T09:17:40Z,Somewhat agree,Somewhat agree,Agree,Strongly agree,Somewhat agree,Somewhat agree,Strongly agree +2440d0cd13,2020-11-27T09:17:18Z,2020-11-27T09:17:58Z,Agree,Somewhat agree,Agree,Strongly agree,Somewhat agree,Agree,Somewhat agree +fb7fdb74c2,2020-11-27T14:21:18Z,2020-11-27T14:21:50Z,Somewhat agree,Neither agree nor disagree,Agree,Agree,Somewhat agree,Disagree,Agree +d6c438997a,2020-11-27T09:12:45Z,2020-11-27T09:14:57Z,Strongly agree,Disagree,Somewhat agree,Strongly agree,Strongly agree,Somewhat disagree,Agree +02bc11a171,2020-11-26T11:13:59Z,2020-11-26T11:15:00Z,Somewhat agree,Somewhat agree,Agree,Agree,Agree,Agree,Strongly agree +24a9bc5884,2020-11-26T11:12:07Z,2020-11-26T11:14:37Z,Agree,Disagree,Somewhat agree,Strongly agree,Agree,Strongly disagree,Strongly agree 
+17c753a6ca,2020-11-27T14:26:01Z,2020-11-27T14:26:09Z,Somewhat disagree,Disagree,Somewhat agree,Agree,Agree,Somewhat disagree,Agree diff --git a/website/data/sweets.csv b/website/data/sweets.csv new file mode 100644 index 0000000..e167174 --- /dev/null +++ b/website/data/sweets.csv @@ -0,0 +1,5 @@ +ID,Start time,Completion time,Email,Name,How much do you like sweets?,How much do you like chocolate,Gender +6,2019-05-24T09:12:31Z,2019-05-24T09:12:35Z,anonymous,NA,I don't like them,I don't like them,M +7,2019-05-24T09:12:31Z,2019-05-24T09:12:35Z,anonymous,NA,I'm neutral,I don't like them,F +8,2019-05-24T09:12:31Z,2019-05-24T09:12:35Z,anonymous,NA,I like them,I'm neutral,M +9,2019-05-24T09:12:31Z,2019-05-24T09:12:35Z,anonymous,NA,I'm neutral,I'm neutral,F diff --git a/website/images/recodewithchatgpt.png b/website/images/recodewithchatgpt.png new file mode 100644 index 0000000..de3a2cc Binary files /dev/null and b/website/images/recodewithchatgpt.png differ diff --git a/website/index.rmd b/website/index.rmd index f8b210f..8e0c882 100644 --- a/website/index.rmd +++ b/website/index.rmd @@ -20,7 +20,6 @@ source('_first_chunk.R') knitr::opts_chunk$set(echo = TRUE, collapse=F, comment=NA, - cache = F, include=T, message=FALSE) ``` @@ -92,8 +91,7 @@ We will use short [LifesavR course](http://benwhalley.github.io/lifesavR/). We c ### Part 2: Data handling and visualisation - Session 5: [Data visualisation](visualisation1.html) -- Session 6: Data wrangling - +- Session 6: [Data wrangling](data-wrangling1.html) - Session 7: Assessment support diff --git a/website/visualisation1.rmd b/website/visualisation1.rmd index c12eef7..1f7f13e 100644 --- a/website/visualisation1.rmd +++ b/website/visualisation1.rmd @@ -56,12 +56,10 @@ Use this new `.rmd` file to save your work during this session. `r embed_youtube("Ek9rFSAq3QU")` - ::: - # Recreate the Rosling plot *"Multi-dimensional plotting"* sounds fancy, but it just means linking different visual features of