generated from statOmics/Rmd-website
-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathheartMainInteraction_HW_otherEncoding.Rmd
644 lines (482 loc) · 17.8 KB
/
heartMainInteraction_HW_otherEncoding.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
---
title: "Proteomics data analysis: heart _ HW"
author: "Lieven Clement"
date: "statOmics, Ghent University (https://statomics.github.io)"
output:
html_document:
code_download: true
theme: cosmo
toc: true
toc_float: true
highlight: tango
number_sections: true
---
# Background
Researchers have assessed the proteome in different regions of the heart for 3 patients (identifiers 3, 4, and 8). For each patient they sampled the left atrium (LA), right atrium (RA), left ventricle (LV) and the right ventricle (RV). The data are a small subset of the public dataset PXD006675 on PRIDE.
Suppose that researchers are mainly interested in comparing the ventricular to the atrial proteome. Particularly, they would like to compare the left atrium to the left ventricle, the right atrium to the right ventricle, the average ventricular vs atrial proteome and if ventricular vs atrial proteome shifts differ between left and right heart region.
# Data
We first import the peptides.txt file. This is the file that contains your peptide-level intensities. For a MaxQuant search [6], this peptides.txt file can be found by default in the "path_to_raw_files/combined/txt/" folder from the MaxQuant output, with "path_to_raw_files" the folder where raw files were saved. In this tutorial, we will use a MaxQuant peptides file from MaxQuant that can be found in the data tree of the SGA2020 github repository https://github.com/statOmics/SGA2020/tree/data/quantification/heart .
To import the data we use the `QFeatures` package.
We generate the object peptideRawFile with the path to the peptideRaws.txt file.
Using the `grepEcols` function, we find the columns that contain the expression
data of the peptideRaws in the peptideRaws.txt file.
```{r, warning=FALSE, message=FALSE}
library(tidyverse)
library(limma)
library(QFeatures)
library(msqrob2)
library(plotly)
peptidesFile <- "https://raw.githubusercontent.com/statOmics/PDA21/data/quantification/heart/peptides.txt"
ecols <- grep("Intensity\\.", names(read.delim(peptidesFile)))
pe <- readQFeatures(
assayData = read.delim(peptidesFile),
fnames = 1,
quantCols = ecols,
name = "peptideRaw")
pe
pe[["peptideRaw"]]
```
We will make use from data wrangling functionalities from the tidyverse package.
The %>% operator allows us to pipe the output of one function to the next function.
```{r}
colData(pe)$location <- substr(
colnames(pe[["peptideRaw"]]),
11,
11) %>%
unlist %>%
as.factor
colData(pe)$tissue <- substr(
colnames(pe[["peptideRaw"]]),
12,
12) %>%
unlist %>%
as.factor
colData(pe)$patient <- substr(
colnames(pe[["peptideRaw"]]),
13,
13) %>%
unlist %>%
as.factor
colData(pe)$compartment <- paste0(colData(pe)$location,colData(pe)$tissue) %>% as.factor
```
We calculate how many non zero intensities we have per peptide and this
will be useful for filtering.
```{r}
rowData(pe[["peptideRaw"]])$nNonZero <- rowSums(assay(pe[["peptideRaw"]]) > 0)
```
Peptides with zero intensities are missing peptides and should be represent
with a `NA` value rather than `0`.
```{r}
pe <- zeroIsNA(pe, "peptideRaw") # convert 0 to NA
```
## Data exploration
`r format(mean(is.na(assay(pe[["peptideRaw"]])))*100,digits=2)`% of all peptide
intensities are missing and for some peptides we do not even measure a signal
in any sample. The missingness is similar across samples.
# Preprocessing
This section preforms standard preprocessing for the peptide data. This
include log transformation, filtering and summarisation of the data.
## Log transform the data
```{r}
pe <- logTransform(pe, base = 2, i = "peptideRaw", name = "peptideLog")
limma::plotDensities(assay(pe[["peptideLog"]]))
```
## Filtering
### Handling overlapping protein groups
In our approach a peptide can map to multiple proteins, as long as there is
none of these proteins present in a smaller subgroup.
```{r}
pe <- filterFeatures(pe, ~ Proteins %in% smallestUniqueGroups(rowData(pe[["peptideLog"]])$Proteins))
```
### Remove reverse sequences (decoys) and contaminants
We now remove the contaminants, peptides that map to decoy sequences, and proteins
which were only identified by peptides with modifications.
First look to the names of the variables for the peptide features
```{r}
pe[["peptideLog"]] %>%
rowData %>%
names
```
No information on decoys.
```{r}
pe <- filterFeatures(pe,~ Potential.contaminant != "+")
```
### Drop peptides that were only identified in one sample
We keep peptides that were observed at last twice.
```{r}
pe <- filterFeatures(pe,~nNonZero >= 2)
nrow(pe[["peptideLog"]])
```
We keep `r nrow(pe[["peptideLog"]])` peptides after filtering.
## Normalize the data
```{r}
pe <- normalize(pe,
i = "peptideLog",
name = "peptideNorm",
method = "center.median")
```
## Explore normalized data
After normalisation the density curves for all samples are comparable.
```{r}
limma::plotDensities(assay(pe[["peptideNorm"]]))
```
This is more clearly seen is a boxplot.
```{r,}
boxplot(assay(pe[["peptideNorm"]]), col = palette()[-1],
main = "Peptide distribtutions after normalisation", ylab = "intensity")
```
We can visualize our data using a Multi Dimensional Scaling plot,
eg. as provided by the `limma` package.
```{r}
limma::plotMDS(assay(pe[["peptideNorm"]]),
col = colData(pe)$location:colData(pe)$tissue %>%
as.numeric,
labels = colData(pe) %>%
rownames %>%
substr(start = 11, stop = 13)
)
```
The first axis in the plot is showing the leading log fold changes
(differences on the log scale) between the samples.
## Summarization to protein level
We use robust summarization in aggregateFeatures. This is the default workflow of aggregateFeatures so you do not have to specifiy the argument `fun`.
However, because we compare methods we have included the `fun` argument to show the summarization method explicitely.
```{r,warning=FALSE}
pe <- aggregateFeatures(pe,
i = "peptideNorm",
fcol = "Proteins",
na.rm = TRUE,
name = "proteinRobust",
fun = MsCoreUtils::robustSummary)
```
```{r}
plotMDS(assay(pe[["proteinRobust"]]),
col = colData(pe)$location:colData(pe)$tissue %>%
as.numeric,
labels = colData(pe) %>%
rownames %>%
substr(start = 11, stop = 13)
)
```
# Data Analysis
## Estimation
We model the protein level expression values using `msqrob`.
By default `msqrob2` estimates the model parameters using robust regression.
```{r, warning=FALSE}
pe <- msqrob(
object = pe,
i = "proteinRobust",
formula = ~ compartment + patient)
```
## Inference
Explore Design
```{r}
library(ExploreModelMatrix)
VisualizeDesign(colData(pe),~ compartment + patient)$plotlist
```
$$
\begin{array}{ccl}
y_i &=& \beta_0 + \beta_{RA}x_{RA,i} + \beta_{LV}x_{LV,i} + \beta_{RV}x_{VR,i} +\beta_{p4}x_{p4,i} + \beta_{p8}x_{p8,i} + \epsilon_i\\
\epsilon_i&\sim& N(0,
\sigma^2)
\end{array}
$$
Means for patient "."?
$$
\begin{array}{ccc}
\mu^{LA}_. &=& \beta_0 + \beta_{p.}\\
\mu^{RA}_. &=& \beta_0 + \beta_{RA}+ \beta_{p.}\\
\mu^{LV}_. &=& \beta_0 + \beta_{LV}+ \beta_{p.}\\
\mu^{RV}_. &=& \beta_0 + \beta_{RV}+ \beta_{p.}
\end{array}
$$
$\log_2$ FC after correcting for patient?
$$
\begin{array}{lcc}
\log_2 \text{FC}_{V-A}^L &=& \beta_{LV}\\
\log_2 \text{FC}_{V-A}^R &=& \beta_{RV} - \beta_{RA}\\
\log_2 \text{FC}_{V-R}^{AVG} &=& \frac{1}{2}\beta_{LV} + \frac{1}{2} \beta_{RV} - \frac{1}{2} \beta_{RA}\\
\log_2 \text{FC}_{V-A}^R - \log \text{FC}_{V-A}^L &=& \beta_{RV} - \beta_{RA} - \beta_{LV}\\
\log_2 \text{FC}_{LA-1/3(LV+RA+RV)} &=& - \frac{1}{3}\beta_{LV} - \frac{1}{3}\beta_{RV} - \frac{1}{3}\beta_{RA}
\end{array}
$$
```{r}
design <- model.matrix(~compartment + patient, data = colData(pe))
L <- makeContrast(
c(
"compartmentLV = 0",
"compartmentRV-compartmentRA = 0",
"1/2*compartmentLV+1/2*compartmentRV - 1/2 *compartmentRA = 0",
"compartmentRV-compartmentRA - compartmentLV = 0",
"- 1/3*compartmentLV - 1/3*compartmentRV - 1/3*compartmentRA = 0"
),
parameterNames = colnames(design)
)
pe <- hypothesisTest(object = pe, i = "proteinRobust", contrast = L, overwrite=TRUE)
```
## Evaluate results contrast $\log_2 FC_{V-A}^L$
### Volcano-plot
```{r,warning=FALSE}
volcanoLeft <- ggplot(rowData(pe[["proteinRobust"]])$"compartmentLV",
aes(x = logFC, y = -log10(pval), color = adjPval < 0.05)) +
geom_point(cex = 2.5) +
scale_color_manual(values = alpha(c("black", "red"), 0.5)) + theme_minimal()
volcanoLeft
```
### Heatmap
We first select the names of the proteins that were declared signficant.
```{r}
sigNamesLeft <- rowData(pe[["proteinRobust"]])$"compartmentLV" %>%
rownames_to_column("proteinRobust") %>%
dplyr::filter(adjPval<0.05) %>%
pull(proteinRobust)
heatmap(assay(pe[["proteinRobust"]])[sigNamesLeft, ])
```
There are `r length(sigNamesLeft)` proteins significantly differentially expressed at the 5% FDR level.
```{r}
rowData(pe[["proteinRobust"]])$"compartmentLV" %>%
cbind(.,rowData(pe[["proteinRobust"]])$Protein.names) %>%
na.exclude %>%
dplyr::filter(adjPval<0.05) %>%
arrange(pval) %>%
knitr::kable(.)
```
## Evaluate results contrast $\log_2 FC_{V-A}^R$
### Volcano-plot
```{r,warning=FALSE}
volcanoRight <- ggplot(rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA",
aes(x = logFC, y = -log10(pval), color = adjPval < 0.05)) +
geom_point(cex = 2.5) +
scale_color_manual(values = alpha(c("black", "red"), 0.5)) + theme_minimal()
volcanoRight
```
### Heatmap
We first select the names of the proteins that were declared signficant.
```{r}
sigNamesRight <- rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA" %>%
rownames_to_column("proteinRobust") %>%
dplyr::filter(adjPval<0.05) %>%
pull(proteinRobust)
heatmap(assay(pe[["proteinRobust"]])[sigNamesRight, ])
```
There are `r length(sigNamesRight)` proteins significantly differentially expressed at the 5% FDR level.
```{r}
rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA" %>%
cbind(.,rowData(pe[["proteinRobust"]])$Protein.names) %>%
na.exclude %>%
dplyr::filter(adjPval<0.05) %>%
arrange(pval) %>%
knitr::kable(.)
```
## Evaluate results average contrast $\log_2 FC_{V-A}$
### Volcano-plot
```{r,warning=FALSE}
volcanoAvg <- ggplot(rowData(pe[["proteinRobust"]])$"1/2 * compartmentLV + 1/2 * compartmentRV - 1/2 * compartmentRA",
aes(x = logFC, y = -log10(pval), color = adjPval < 0.05)) +
geom_point(cex = 2.5) +
scale_color_manual(values = alpha(c("black", "red"), 0.5)) + theme_minimal()
volcanoAvg
```
### Heatmap
We first select the names of the proteins that were declared signficant.
```{r}
sigNamesAvg <- rowData(pe[["proteinRobust"]])$"1/2 * compartmentLV + 1/2 * compartmentRV - 1/2 * compartmentRA" %>%
rownames_to_column("proteinRobust") %>%
dplyr::filter(adjPval<0.05) %>%
pull(proteinRobust)
heatmap(assay(pe[["proteinRobust"]])[sigNamesAvg, ])
```
There are `r length(sigNamesAvg)` proteins significantly differentially expressed at the 5% FDR level.
```{r}
rowData(pe[["proteinRobust"]])$"1/2 * compartmentLV + 1/2 * compartmentRV - 1/2 * compartmentRA" %>%
cbind(.,rowData(pe[["proteinRobust"]])$Protein.names) %>%
na.exclude %>%
dplyr::filter(adjPval<0.05) %>%
arrange(pval) %>%
knitr::kable(.)
```
## Interaction
### Volcano-plot
```{r,warning=FALSE}
volcanoInt <- ggplot(rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA - compartmentLV",
aes(x = logFC, y = -log10(pval), color = adjPval < 0.05)) +
geom_point(cex = 2.5) +
scale_color_manual(values = alpha(c("black", "red"), 0.5)) + theme_minimal()
volcanoInt
```
### Heatmap
There were no genes significant at the 5% FDR level.
We return the top 25 genes.
```{r}
sigNamesInt <- rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA - compartmentLV" %>%
rownames_to_column("proteinRobust") %>%
dplyr::filter(adjPval<0.05) %>%
pull(proteinRobust)
hlp <- order((rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA - compartmentLV")[,"adjPval"])[1:25]
heatmap(assay(pe[["proteinRobust"]])[hlp, ])
```
There are `r length(sigNamesInt)` proteins significantly differentially expressed at the 5% FDR level.
```{r}
rowData(pe[["proteinRobust"]])$"compartmentRV - compartmentRA - compartmentLV" %>%
cbind(.,rowData(pe[["proteinRobust"]])$Protein.names) %>%
na.exclude %>%
dplyr::filter(adjPval<0.05) %>%
arrange(pval)
```
## Fold Change LA vs average of other compartment
### Volcano-plot
```{r,warning=FALSE}
volcanoNew <- ggplot(rowData(pe[["proteinRobust"]])$"-1/3 * compartmentLV - 1/3 * compartmentRV - 1/3 * compartmentRA",
aes(x = logFC, y = -log10(pval), color = adjPval < 0.05)) +
geom_point(cex = 2.5) +
scale_color_manual(values = alpha(c("black", "red"), 0.5)) + theme_minimal()
volcanoNew
```
### Heatmap
```{r}
sigNamesNew <- rowData(pe[["proteinRobust"]])$"-1/3 * compartmentLV - 1/3 * compartmentRV - 1/3 * compartmentRA" %>%
rownames_to_column("proteinRobust") %>%
dplyr::filter(adjPval<0.05) %>%
pull(proteinRobust)
heatmap(assay(pe[["proteinRobust"]])[sigNamesNew, ])
```
There are `r length(sigNamesNew)` proteins significantly differentially expressed at the 5% FDR level.
```{r}
rowData(pe[["proteinRobust"]])$"-1/3 * compartmentLV - 1/3 * compartmentRV - 1/3 * compartmentRA" %>%
cbind(.,rowData(pe[["proteinRobust"]])$Protein.names) %>%
na.exclude %>%
dplyr::filter(adjPval<0.05) %>%
arrange(pval)
```
# Why are there differences in the results?
## Results Old Incoding
```{r, warning=FALSE}
oldEncoding <- pe <- msqrob(
object = pe,
i = "proteinRobust",
formula = ~ location*tissue + patient, modelColumnName = "msqrobModelsMainInt")
```
```{r}
designMainInt <- model.matrix(~location*tissue + patient, data = colData(pe))
LMainInt <- makeContrast(
c(
"tissueV = 0",
"tissueV + locationR:tissueV = 0",
"tissueV + 0.5*locationR:tissueV = 0","locationR:tissueV = 0",
"- 2/3*locationR - 2/3*tissueV - 1/3*locationR:tissueV = 0"
),
parameterNames = colnames(designMainInt)
)
pe <- hypothesisTest(object = pe, i = "proteinRobust", contrast = LMainInt, overwrite=TRUE, modelColumn = "msqrobModelsMainInt")
```
## Compare models
### Number of significant proteins?
```{r}
nSigMainInt <- sapply(colnames(LMainInt), function(x) rowData(pe[["proteinRobust"]])[[x]] %>% dplyr::filter(adjPval<0.05) %>% nrow)
nSigNew <- sapply(colnames(L), function(x) rowData(pe[["proteinRobust"]])[[x]] %>% dplyr::filter(adjPval<0.05) %>% nrow)
nSigMainInt
nSigNew
```
### Model fit individual protein?
```{r}
(rowData(pe[["proteinRobust"]])$"msqrobModels")[[sigNamesLeft[1]]] %>% getModel
```
```{r}
(rowData(pe[["proteinRobust"]])$"msqrobModelsMainInt")[[sigNamesLeft[1]]] %>% getModel
```
Model fit is equal!
### Main effect - Interaction encoding
$$
\begin{array}{ccl}
y_i &=& \beta_0 + \beta_Rx_{R,i} + \beta_Vx_{V,i} + \beta_{R:V}x_{R,i}x_{V,i} +\beta_{p4}x_{p4,i} + \beta_{p8}x_{p8,i} + \epsilon_i\\
\epsilon_i&\sim& N(0,
\sigma^2)
\end{array}
$$
Means for patient "."?
$$
\begin{array}{ccl}
\mu^{LA}_. &=& \beta_0 + \beta_{p.}\\
\mu^{RA}_. &=& \beta_0 + \beta_{R}+ \beta_{p.}\\
\mu^{LV}_. &=& \beta_0 + \beta_{L}+ \beta_{p.}\\
\mu^{RV}_. &=& \beta_0 + \beta_{R} + \beta_{L} + \beta_{R:V}+ \beta_{p.}
\end{array}
$$
### Compartment encoding
$$
\begin{array}{ccl}
y_i &=& \beta_0 + \beta_{RA}x_{RA,i} + \beta_{LV}x_{LV,i} + \beta_{RV}x_{VR,i} +\beta_{p4}x_{p4,i} + \beta_{p8}x_{p8,i} + \epsilon_i\\
\epsilon_i&\sim& N(0,
\sigma^2)
\end{array}
$$
Means for patient "."?
$$
\begin{array}{ccc}
\mu^{LA}_. &=& \beta_0 + \beta_{p.}\\
\mu^{RA}_. &=& \beta_0 + \beta_{RA}+ \beta_{p.}\\
\mu^{LV}_. &=& \beta_0 + \beta_{LV}+ \beta_{p.}\\
\mu^{RV}_. &=& \beta_0 + \beta_{RV}+ \beta_{p.}
\end{array}
$$
### Parameterisation coincides
$$
\begin{array}{ccl}
\hat\beta_0^\text{new} &=& \hat\beta_0^\text{mainInt}\\
\hat\beta_{p4}^\text{new} &=& \hat\beta_{p4}^\text{mainInt}\\
\hat\beta_{p8}^\text{new} &=& \hat\beta_{p8}^\text{mainInt}\\
\hat\beta_{LV}^\text{new} &=& \hat\beta_V^\text{mainInt}\\
\hat\beta_{RA}^\text{new} &=& \hat\beta_A^\text{mainInt}\\
\hat\beta_{RV}^\text{new} &=& \hat\beta_R^\text{mainInt}+\hat\beta_V^\text{mainInt}+\hat\beta_{R:V}^\text{mainInt}\\
\hat\sigma^\text{new} &=& \hat\sigma^\text{mainInt}\\
\end{array}
$$
```{r}
(rowData(pe[["proteinRobust"]])$"msqrobModels")[[sigNamesLeft[1]]] %>% getSigma
coefs <- (rowData(pe[["proteinRobust"]])$"msqrobModels")[[sigNamesLeft[1]]] %>% getCoef
coefs
```
```{r}
(rowData(pe[["proteinRobust"]])$"msqrobModelsMainInt")[[sigNamesLeft[1]]] %>% getSigma
coefsMainInt <- (rowData(pe[["proteinRobust"]])$"msqrobModelsMainInt")[[sigNamesLeft[1]]] %>% getCoef
coefsMainInt
```
```{r}
coefsMainInt[c("tissueV","locationR","locationR:tissueV")] %>% sum
```
### Posterior variance is slightly different
```{r}
(rowData(pe[["proteinRobust"]])$"msqrobModels")[[sigNamesLeft[1]]] %>% getVarPosterior()
(rowData(pe[["proteinRobust"]])$"msqrobModelsMainInt")[[sigNamesLeft[1]]] %>% getVarPosterior()
(rowData(pe[["proteinRobust"]])$"msqrobModels")[[sigNamesLeft[1]]] %>% getDfPosterior
(rowData(pe[["proteinRobust"]])$"msqrobModelsMainInt")[[sigNamesLeft[1]]] %>% getDfPosterior()
```
Why?
```{r}
modType <- sapply(rowData(pe[["proteinRobust"]])$msqrobModels, function(x) x@type)
modType %>% as.factor %>% summary()
```
```{r}
modTypeMainInt <- sapply(rowData(pe[["proteinRobust"]])$msqrobModelsMainInt, function(x) x@type)
modTypeMainInt %>% as.factor %>% summary()
```
```{r}
notFittedInOne <- which(modType!=modTypeMainInt)
modType[notFittedInOne] %>% unique()
```
```{r}
modTypeMainInt[notFittedInOne] %>% unique()
```
More fit errors when using main / interaction encoding.
Squeezing will differ slightly + slight differences in FDR control
### Example
```{r}
(rowData(pe[["proteinRobust"]])$"msqrobModelsMainInt")[[notFittedInOne[2]]] %>% getModel
(rowData(pe[["proteinRobust"]])$"msqrobModels")[[notFittedInOne[2]]] %>% getModel
```
```{r}
y <- assay(pe[["proteinRobust"]])[notFittedInOne[2],]
y
lm(y ~ -1 +design)
lm(y ~ -1 +designMainInt)
```