forked from manybabies/mb1-analysis-public
---
title: "02_trial_merge.Qmd"
format: html
---
```{r}
library(tidyverse)
library(janitor)
library(here)
library(lme4) # provides lmer(), used for the model at the end
```
Let's read in trial data from the different labs.

The labs right now are Senegal, Uganda, Malawi, Rwanda, Ghana, and Kenya. (Uganda is not read in anywhere in this file yet.)

# Ghana

The Ghana data are already in long form, so I think this is the easiest one to start with.
```{r}
ghana <- readxl::read_xlsx(here("processed_data", "trials_cleaned", "Omane - Ghana.xlsx")) |>
  clean_names() |>
  # lookin_time_s (sic): assumed to be the column name as it comes out of clean_names()
  mutate(looking_time_s = as.numeric(lookin_time_s)) |>
  rename(order = test_order) |>
  select(lab, subid, order, trial_type, stimulus, trial_num, looking_time_s,
         trial_error, trial_error_type)
```
Just for fun, let's plot the Ghana looking times.
```{r}
ggplot(ghana,
       aes(x = trial_num, y = looking_time_s, col = trial_type)) +
  geom_jitter(width = .2, height = 0, alpha = .5) +
  geom_smooth()
```
```{r}
ggplot(ghana, aes(x = looking_time_s, fill = trial_type)) +
  geom_histogram(binwidth = 1)
```

# Senegal, Malawi, Rwanda, Kenya

Next is a set of labs whose data are in event format.
```{r}
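log_labs <- c("Diop - Senegal", "Lamba - Malawi", "Mushimiyimana - Rwanda", "Ziedler - Kenya")

# List the file(s) in each lab's directory
files <- lapply(log_labs, function(x) dir(path = here("processed_data", "trials_cleaned", x)))

# Read each lab's log and tag its rows with the lab name
log_labs_data_raw <- map_df(seq_along(log_labs), function(x) {
  # files[[x]] extracts the character vector of file names; files[x] would pass a list
  read_csv(here("processed_data", "trials_cleaned", log_labs[x], files[[x]])) |>
    mutate(lab = log_labs[x])
}) |>
  # Drop rows whose SubjectID is "Phase" or "MBG_IDS" (presumably non-trial rows)
  filter(!(SubjectID %in% c("Phase", "MBG_IDS"))) |>
  janitor::clean_names()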
```
The goal is to get these labs into the same columns as Ghana: `select(lab, subid, order, trial_type, stimulus, trial_num, looking_time_s, trial_error, trial_error_type)`.
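
The target column list can be turned into a quick check that a lab's data frame is ready to bind. This is only a sketch: `toy_lab` is a made-up one-row tibble and `missing_target_cols()` is a hypothetical helper, neither of which is part of the pipeline here.

```{r}
library(tibble)

# Target columns shared by every lab (copied from the goal stated above)
target_cols <- c("lab", "subid", "order", "trial_type", "stimulus",
                 "trial_num", "looking_time_s", "trial_error", "trial_error_type")

# Hypothetical helper: which target columns does a data frame lack?
missing_target_cols <- function(df) setdiff(target_cols, names(df))

# Made-up example row, deliberately missing the two error columns
toy_lab <- tibble(lab = "toy", subid = "s01", order = 1,
                  trial_type = "IDS", stimulus = "ids1",
                  trial_num = 1, looking_time_s = 7.5)

missing_target_cols(toy_lab)
```

An empty result would mean the data frame already has every target column.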
```{r}
log_labs_data <- log_labs_data_raw |>
  filter(end_type != "AGAbort") |>
  mutate(subid = subject_id,
         order = as.numeric(str_sub(order_randomization, 7, 7)),
         stimulus = str_replace(str_to_lower(stim_name), "\\_", ""),
         trial_type = case_when(str_detect(stim_name, "IDS") ~ "IDS",
                                str_detect(stim_name, "ADS") ~ "ADS",
                                TRUE ~ "training"),
         trial = as.numeric(trial),
         trial_num = ifelse(trial < 3, trial - 3, trial - 2),
         looking_time_s = total_look / 1000,
         looking_time_diff = (trial_end - trial_start) / 1000,
         total_center = total_center / 1000,
         enabled_diff = (look_disabled - look_enabled) / 1000,
         trial_error = NA,
         trial_error_type = NA) |>
  select(lab, subid, order, trial_type, stimulus, trial_num, looking_time_s,
         looking_time_diff, total_center, enabled_diff,
         trial_error, trial_error_type)
```
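
The `trial_num` recoding in the chunk above is easy to misread, so here is a small check on made-up trial indices. The interpretation that raw trials 1 and 2 are the training trials is an inference from the `trial < 3` condition and is not stated in the data; `toy_trial` and `toy_trial_num` are illustrative names only.

```{r}
# Raw trial indices as they might appear in the log files (made-up values)
toy_trial <- 1:6

# Same expression as in the mutate() call above
toy_trial_num <- ifelse(toy_trial < 3, toy_trial - 3, toy_trial - 2)

# Raw trials 1 and 2 become -2 and -1; raw trial 3 becomes trial_num 1
toy_trial_num
```

The recoding skips zero, so negative values mark the first two trials and test trials count up from 1.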
```{r}
d <- bind_rows(ghana, log_labs_data)
```
We investigate what the different columns mean. Our hypothesis:

- `total_look` = looking time, not including lookaways
- `look_disabled - look_enabled` = looking time, including lookaways but not attention getters
- `trial_end - trial_start` = looking time, including lookaways AND attention getters
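
If the hypothesis above is right, the three measures should be nested: each should be no longer than the next. The sketch below checks that ordering on made-up values. The column names follow `log_labs_data`, while `toy_times` and `toy_check` are hypothetical and the numbers are not real data.

```{r}
library(dplyr)
library(tibble)

# Made-up trials, in seconds
toy_times <- tibble(
  looking_time_s    = c(5.0, 0.0, 12.0),  # from total_look
  enabled_diff      = c(6.5, 4.0, 12.0),  # look_disabled - look_enabled
  looking_time_diff = c(8.0, 6.0, 11.0)   # trial_end - trial_start
)

# Proportion of trials where the measures are ordered as hypothesized
toy_check <- toy_times |>
  mutate(consistent = looking_time_s <= enabled_diff &
           enabled_diff <= looking_time_diff) |>
  summarise(prop_consistent = mean(consistent))

toy_check
```

Run on the real data, a proportion well below 1 would suggest that at least one of the column interpretations is off.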

The basic issue is that Senegal has a lot of zero looking times. This appears to be plausibly due to two intersecting issues:

1. a bug in Habit such that if you hold down the key forever, you get LT = 0.
2. probable misuse of the software by holding down the key a lot.
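
One way to put a number on the zero looking times is to compute the share of zero-LT trials per lab. This is a minimal sketch on made-up data: `toy_d`, its lab names, and its values are illustrative only. The same `group_by()` and `summarise()` pattern should apply to `d`.

```{r}
library(dplyr)
library(tibble)

# Made-up data: lab names and looking times are illustrative only
toy_d <- tibble(
  lab = c("lab A", "lab A", "lab A", "lab A", "lab B", "lab B"),
  looking_time_s = c(0, 0, 3.2, 0, 5.1, 7.4)
)

# Number of trials and proportion of zero looking times per lab
toy_zero_summary <- toy_d |>
  group_by(lab) |>
  summarise(n_trials = n(),
            prop_zero = mean(looking_time_s == 0))

toy_zero_summary
```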
```{r}
ggplot(d,
       aes(x = trial_num, y = looking_time_s, col = trial_type)) +
  geom_jitter(width = .2, height = 0, alpha = .5) +
  geom_smooth() +
  facet_wrap(~lab)

ggplot(d,
       aes(x = trial_num, y = enabled_diff, col = trial_type)) +
  geom_jitter(width = .2, height = 0, alpha = .5) +
  geom_smooth() +
  facet_wrap(~lab)
```
```{r}
filter(d, lab != "Diop - Senegal",
       looking_time_s > 2) |>
  ggplot(aes(x = trial_num, y = looking_time_s, col = trial_type)) +
  geom_jitter(width = .2, height = 0, alpha = .5) +
  scale_y_log10() +
  geom_smooth(method = "lm")
```
```{r}
mod <- lmer(log(looking_time_s) ~ trial_num * trial_type +
              (1 | subid) +
              (trial_type | lab),
            data = filter(d, lab != "Diop - Senegal",
                          looking_time_s > 2,
                          trial_type != "training"))
summary(mod)
```