---
title: "Learning Novel Categories through Background Knowledge Using Generic Statements"
author: "Xiuyuan Zhang & Dan Yurovsky"
date: "4/17/2018"
fontsize: 11pt
output:
  pdf_document:
    latex_engine: xelatex
  word_document: default
fig_crop: false
geometry: margin=0.75in
header-includes:
  - \usepackage{pdfpages}
---

## Hypothesis

In a speaker-listener interaction in which the speaker utters a true generic statement '$C$ (category) are $F$ (feature),' we hypothesize that, when $C$ is a novel category, the listener uses her own background knowledge to infer the prevalence of feature $F$ in category $C$ upon hearing the generic. In our current paradigm, we provide participants with a familiar category $C_{familiar}$, which serves as this background knowledge.

## Conditions

The table below lists each feature, its comparison categories (chosen based on our estimates of their low, medium, and high prevalence), and its novel category.

| Feature  | Comparison Categories                  | Novel Category |
|----------|----------------------------------------|----------------|
| friendly | Puppies (H), Goats (M), Squirrels (L)  | Feps           |
| tasty    | Pizzas (H), Fruits (M), Vegetables (L) | Kobas          |
| heavy    | Trucks (H), Rocks (M), Bikes (L)       | Dands          |

(H = high prevalence, M = medium prevalence, L = low prevalence)

## Setup

We ran three separate surveys on Amazon Mechanical Turk. All three gave participants a narrative introducing an imaginary country, Akar; sample questions from each survey appear in the section below. The first survey recorded participants' evaluations of a given generic statement as True or False, along with their prevalence estimates for all nine categories above (three per feature). The second survey introduced a novel category $C_{novel}$ together with a familiar comparison category $C_{familiar}$, then asked participants to estimate the prevalence of the feature $F$ in the novel category $C_{novel}$. The third survey was similar to the second, except that we only stated '$C_{novel}$ are like $C_{familiar}$' before asking participants to estimate the prevalence of feature $F$ for $C_{novel}$. This third survey serves as a sanity check: it tests whether participants treat $C_{novel}$ as equivalent to $C_{familiar}$, providing estimates for $C_{novel}$ that are similar to, rather than higher than, their estimates for $C_{familiar}$.
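
As a purely illustrative sketch of this sanity check (the numbers below are made up, not our data), equivalent treatment of $C_{novel}$ and $C_{familiar}$ would show up as a near-zero paired difference between the two sets of estimates:

```{r}
# Illustrative sketch only: hypothetical prevalence estimates (0-100) from five
# participants for a familiar category and for the matched novel category.
familiar_estimates <- c(85, 90, 80, 88, 92)
novel_estimates    <- c(83, 91, 78, 90, 89)

# If participants treat the novel category as equivalent to the familiar one,
# the paired differences should hover around zero.
sanity_check <- t.test(novel_estimates, familiar_estimates, paired = TRUE)
sanity_check$estimate  # mean paired difference
```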

## Below are two examples of the MTurk surveys we ran:
1. Getting the baseline prevalence rate for familiar comparison categories.
\includegraphics{survey_screenshots/baseline-021118/baseline_narrative.png}
\includegraphics{survey_screenshots/baseline-021118/baseline_generic_eval.png}
\includegraphics{survey_screenshots/baseline-021118/baseline_p_estimation.png}
2. Introducing a novel category with a generic statement, providing participants with familiar comparison categories.
\includegraphics{survey_screenshots/explicit-generic-021118/explicit_generic_narrative.png}
\includegraphics{survey_screenshots/explicit-generic-021118/explicit_generic_p_estimation.png}

## Results: Data Analysis and Plots
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
#### Load libraries
```{r load_libraries, message = F, warning = F}
library(tidyverse)
library(lubridate)
library(lme4)
library(DT)
library(ggpirate)
library(gridExtra)
theme_set(theme_classic(base_size = 14))
```
#### Read in data
```{r, message = F}
### baseline ###
data_baseline <- read_csv("baseline-survey110218/data/020818baseline-anonymized-data.csv")
### explicit generic ###
data_generic <- read_csv("generic-survey110218/data/020818generic-anonymized-data.csv")
### implicit generic ###
data_implicit <- read_csv("implicit-generic022318/data/022318implicit-generic-anonymized-data.csv")
```
#### Filter participants who passed the attention check
```{r}
### baseline ###
qualtrics_filtered_baseline <- data_baseline %>%
filter(Q8 == "friendly,tasty,heavy")
### explicit generic ###
qualtrics_filtered_generic <- data_generic %>%
filter(Q10 == "kobas,feps,dands")
### implicit generic ###
qualtrics_filtered_implicit <- data_implicit %>%
filter(Q10 == "kobas,feps,dands")
```
#### Munge data
The categories and features are consistent across all three surveys.
```{r}
# use to group different levels
high_objects <- c("puppies","trucks","pizzas")
med_objects <- c("goats", "rocks", "fruits")
low_objects <- c("squirrels", "bikes", "vegetables")
features <- c("friendly", "heavy", "tasty")
# features and objects for the survey
objects <- tibble(type = "high", object = high_objects,
                  feature = features) %>%
  bind_rows(tibble(type = "medium", object = med_objects,
                   feature = features)) %>%
  bind_rows(tibble(type = "low", object = low_objects,
                   feature = features))
# get data for each of the three conditions
### baseline ###
q1_baseline <- qualtrics_filtered_baseline %>%
select(id, L_feature, L_object, Q1, Q2_1) %>%
rename(feature = L_feature, baseline_object = L_object, truefalse_response = Q1, percent_response = Q2_1)
q2_baseline <- qualtrics_filtered_baseline %>%
select(id, M_feature, M_object, Q3, Q4_1) %>%
rename(feature = M_feature, baseline_object = M_object, truefalse_response = Q3, percent_response = Q4_1)
q3_baseline <- qualtrics_filtered_baseline %>%
select(id, H_feature, H_object, Q5, Q6_1) %>%
rename(feature = H_feature, baseline_object = H_object, truefalse_response = Q5, percent_response = Q6_1)
### explicit generic ###
colnames(qualtrics_filtered_generic)[colnames(qualtrics_filtered_generic) == "Duration (in seconds)"] <- "time_spent"
# get data for each of the three conditions
q1_generic <- qualtrics_filtered_generic %>%
select(id, L_feature, L_comparison, L_novel, Q2_1, time_spent) %>%
rename(feature = L_feature, baseline_object = L_comparison, novel_object = L_novel, percent_response = Q2_1)
q2_generic <- qualtrics_filtered_generic %>%
select(id, M_feature, M_comparison, M_novel, Q5_1, time_spent) %>%
rename(feature = M_feature, baseline_object = M_comparison, novel_object = M_novel, percent_response = Q5_1)
q3_generic <- qualtrics_filtered_generic %>%
select(id, H_feature, H_comparison, H_novel, Q6_1, time_spent) %>%
rename(feature = H_feature, baseline_object = H_comparison, novel_object = H_novel, percent_response = Q6_1)
### implicit generic ###
colnames(qualtrics_filtered_implicit)[colnames(qualtrics_filtered_implicit) == "Duration (in seconds)"] <- "time_spent"
# get data for each of the three conditions
q1_implicit <- qualtrics_filtered_implicit %>%
select(id, L_feature, L_comparison, L_novel, Q2_1, time_spent) %>%
rename(feature = L_feature, baseline_object = L_comparison, novel_object = L_novel, percent_response = Q2_1)
q2_implicit <- qualtrics_filtered_implicit %>%
select(id, M_feature, M_comparison, M_novel, Q5_1, time_spent) %>%
rename(feature = M_feature, baseline_object = M_comparison, novel_object = M_novel, percent_response = Q5_1)
q3_implicit <- qualtrics_filtered_implicit %>%
select(id, H_feature, H_comparison, H_novel, Q6_1, time_spent) %>%
rename(feature = H_feature, baseline_object = H_comparison, novel_object = H_novel, percent_response = Q6_1)
# combine
### baseline ###
response_data_baseline <- bind_rows(q1_baseline, q2_baseline, q3_baseline) %>%
mutate(type = if_else(baseline_object %in% high_objects, "high",
if_else(baseline_object %in% med_objects, "medium","low"))) %>%
mutate(type = factor(type, levels = c("low", "medium", "high")))
### explicit generic ###
response_data_generic <- bind_rows(q1_generic, q2_generic, q3_generic) %>%
mutate(type = if_else(baseline_object %in% high_objects, "high",
if_else(baseline_object %in% med_objects, "medium","low"))) %>%
mutate(type = factor(type, levels = c("low", "medium", "high")))
### implicit generic ###
response_data_implicit <- bind_rows(q1_implicit, q2_implicit, q3_implicit) %>%
mutate(type = if_else(baseline_object %in% high_objects, "high",
if_else(baseline_object %in% med_objects, "medium","low"))) %>%
mutate(type = factor(type, levels = c("low", "medium", "high")))
### baseline ###
# filter to include only participants who chose "Yes" to the question on evaluating generic statement
response_filtered_yes_baseline <- response_data_baseline %>%
filter(truefalse_response == "Yes")
### explicit generic ###
# filter participants based on their time of survey completion
response_data_control_for_time_generic <- response_data_generic %>%
filter(time_spent > 45)
### implicit generic ###
# filter participants based on their time of survey completion
response_data_control_for_time_implicit <- response_data_implicit %>%
filter(time_spent > 45)
```
```{r}
head(response_data_control_for_time_generic)
head(response_filtered_yes_baseline)
#with feature + type
response_filtered_yes_baseline %>%
group_by(id,type, feature) %>%
summarize(percent_response=mean(percent_response), n()) %>%
group_by(feature,type) %>%
summarize(mean=mean(percent_response), std_err=sd(percent_response)/sqrt(length(percent_response)), n())
#just type
full_data_yes <- response_filtered_yes_baseline %>%
select(-truefalse_response) %>%
mutate(condition='baseline') %>%
bind_rows(response_data_control_for_time_generic %>% mutate(condition='generic'))
full_data_yes_by_type <- full_data_yes %>%
group_by(id,type,condition) %>%
summarize(percent_response=mean(percent_response), n()) %>%
group_by(condition,type) %>%
summarize(mean=mean(percent_response), std_err=sd(percent_response)/sqrt(length(percent_response)), n())
ggplot(data=full_data_yes_by_type, aes(x=type, y=mean, color=condition)) +
geom_pointrange(aes(ymin = mean-std_err, ymax = mean+std_err),
position=position_dodge(.5)) +
coord_cartesian(ylim=c(0,100))
# NOT FILTERING YES, ALL BASELINE
response_data_baseline %>%
group_by(id,type, feature) %>%
summarize(percent_response=mean(percent_response), n()) %>%
group_by(feature,type) %>%
summarize(mean=mean(percent_response), std_err=sd(percent_response)/sqrt(length(percent_response)), n())
#just type
library(tidyboot)
full_data <- response_data_baseline %>%
select(-truefalse_response) %>%
mutate(condition='baseline') %>%
bind_rows(response_data_control_for_time_generic %>% mutate(condition='novel'))
full_data_by_type <- full_data %>%
group_by(condition,type) %>%
tidyboot_mean(percent_response)
# summarize(mean=mean(percent_response), std_err=sd(percent_response)/sqrt(length(percent_response)), n())
ggplot(data=full_data_by_type, aes(x=type, y=mean, fill=condition)) +
geom_bar(stat="identity", position="dodge")+
geom_linerange(aes(ymin = ci_lower, ymax = ci_upper), size = 0.4, position = position_dodge(.9)) +
coord_cartesian(ylim=c(40,100)) +
ylab("Percent(%)") +
xlab("Feature Prevalence Estimation by Level") +
ggtitle("Mean Feature Prevalence Estimation")
```
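
The standard error of the mean is computed inline several times above; a small helper (a refactoring sketch, not part of the original analysis) keeps the formula in one place:

```{r}
# Standard error of the mean: sd(x) / sqrt(n), matching the inline
# sd(percent_response) / sqrt(length(percent_response)) used above.
sem <- function(x) sd(x) / sqrt(length(x))

# Usable directly inside summarize(), e.g. std_err = sem(percent_response)
sem(c(40, 60, 80, 100))
```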
```{r}
# plot - pirate
### baseline ###
p1_baseline <- ggplot(response_data_baseline, aes(x = type, y = percent_response, color = feature, fill = feature)) +
facet_wrap(~ feature) +
ggtitle("Baseline - All") +
geom_pirate()
p2_baseline <- ggplot(response_filtered_yes_baseline, aes(x = type, y = percent_response, color = feature, fill = feature)) +
facet_wrap(~ feature) +
ggtitle("Baseline - Yes-Only") +
geom_pirate()
### explicit generic ###
p1_generic<- ggplot(response_data_generic, aes(x = type, y = percent_response, color = feature, fill = feature)) +
facet_wrap(~ feature) +
ggtitle("Explicit Generic - All") +
geom_pirate()
p2_generic <- ggplot(response_data_control_for_time_generic, aes(x = type, y = percent_response, color = feature, fill = feature)) +
facet_wrap(~ feature) +
ggtitle("Explicit Generic - Time Spent > 45") +
geom_pirate()
### implicit generic ###
p1_implicit<- ggplot(response_data_implicit, aes(x = type, y = percent_response, color = feature, fill = feature)) +
facet_wrap(~ feature) +
ggtitle("Implicit Generic - All") +
geom_pirate()
p2_implicit <- ggplot(response_data_control_for_time_implicit, aes(x = type, y = percent_response, color = feature, fill = feature)) +
facet_wrap(~ feature) +
ggtitle("Implicit Generic - Time Spent > 45") +
geom_pirate()
#### arrange graphs
#grid.arrange(p1_generic, p2_generic, ncol = 2)
#grid.arrange(p1_baseline, p2_baseline, ncol = 2)
#grid.arrange(p1_implicit, p2_implicit, ncol = 2)
# put all graphs together
#grid.arrange(p1_baseline, p2_baseline, p1_implicit, p2_implicit, p1_generic, p2_generic, ncol = 2)
# put baseline-all, implicit-generic time spent>45, explicit generic time spent >45
#grid.arrange(p1_baseline, p2_implicit, p2_generic, ncol = 3)
# put baseline-Yes, implicit-generic time spent>45, explicit generic time spent >45
#grid.arrange(p2_baseline, p2_implicit, p2_generic, ncol = 3)
```