-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support exclusion of specific markers and marker combinations from the Polyfunctionality Score calculation #62
Comments
Here's a reprex of how to do this currently: The first part is boilerplate to construct an actual fitted library(COMPASS)
library(readxl)
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.5.1
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(assertthat)
f <- system.file("extdata","SimpleCOMPASSExample.xlsx",package="COMPASS")
data <- read_excel(f)
# tidy the input count matrix
# rename the subsets, keep only the booleans, ensure consistent ordering
tidy_data <- data %>%
gather(Population, Count, -c(1:3)) %>%
mutate(Population = gsub("S/Avid-/L/CD3\\+/CD4\\+","CD4",Population)) %>%
mutate(Population = gsub(",Count","",Population)) %>%
mutate(Population = gsub("22","IL22",Population)) %>%
mutate(Population = gsub("10","IL10",Population)) %>%
mutate(Population = gsub("17A","IL17A",Population)) %>%
mutate(Population = gsub("17F","IL17F",Population)) %>%
mutate(Population = gsub("MIP","MIP1B",Population)) %>%
mutate(Population = gsub("TNF","TNFa",Population)) %>%
mutate(Population = translate_marker_names(Population)) %>%
mutate(Population = gsub("([&!])2&","\\1IL2&",Population)) %>%
mutate(Population = gsub("CD4/","",Population)) %>%
filter(Population != "CD4") %>%
mutate(Parent = "CD4") %>%
spread(Population, Count) %>%
arrange(`PATIENT ID`,Stimulation, GROUP, Parent) %>%
mutate(Stimulation = toupper(Stimulation))
# get the stim and unstim data into separate data frames
nu = tidy_data %>% filter(Stimulation == "UNSTIMULATED") %>% arrange(`PATIENT ID`)
ns = tidy_data %>% filter(Stimulation != "UNSTIMULATED") %>% arrange(`PATIENT ID`)
# order of subject ids should be the same
assertthat::are_equal(nu$`PATIENT ID`,ns$`PATIENT ID`)
#> [1] TRUE
# extract the metadata and the stim and unstim count matrices
metadata <- ns %>% select(`PATIENT ID`,GROUP,Stimulation,Parent)
nu <- nu %>% select(-`PATIENT ID`,-GROUP,-Stimulation,-Parent) %>% as.matrix
ns <- ns %>% select(-`PATIENT ID`,-GROUP,-Stimulation,-Parent) %>% as.matrix
# reverse the order of the cell subsets, so that the all-negative is the last column
nu <- nu[,rev(colnames(nu))]
ns <- ns[,rev(colnames(ns))]
# rownames must be set
rownames(nu) <- metadata$`PATIENT ID`
rownames(ns) <- metadata$`PATIENT ID`
metadata <- as.data.frame(metadata)
rownames(metadata) = metadata$`PATIENT ID`
# quick fit via SimpleCOMPASS interface
simplefit <- SimpleCOMPASS(n_s = ns, n_u = nu, meta = metadata, individual_id = "PATIENT ID",iterations = 1000, replications = 1)
#> Initializing parameters...
#> Computing initial parameter estimates...
#> Iteration 1000 of 1000.
#> Fitting model with 1 replications.
#> Running replication 1 of 1...
#> Iteration 1000 of 1000.
#> Done!
# Now compute the polyfunctionality score excluding IL17A and IL17F and IL17A/IL17F
## first get indices of the three cell subsets from the categories matrix
il17a_single <- as.data.frame(simplefit$data$categories)$IL17A == 1 & as.data.frame(simplefit$data$categories)$Counts == 1
il17f_single <- as.data.frame(simplefit$data$categories)$IL17F == 1 & as.data.frame(simplefit$data$categories)$Counts == 1
il17af_double <- as.data.frame(simplefit$data$categories)$IL17A == 1 &
as.data.frame(simplefit$data$categories)$IL17F == 1 &
as.data.frame(simplefit$data$categories)$Counts == 2
indices <- il17a_single|il17af_double|il17f_single
## next extract the Gamma matrix from the fit
gamma_mat <- COMPASS:::Gamma(simplefit)
dim(gamma_mat)
#> [1] 21 256 1000
# Third dimension is the iterations
# second dimension is the cell subsets
# first dimension is the subjects
# we want to average across the iterations to compute the mean_gamma matrix
# from which we will construct a PFS score.
# compute the mean_gamma matrix averaging over iterations
mean_gamma <- apply(gamma_mat[,!indices,],1:2,mean)
# get the degree of each subset in the matrix.
degree <- simplefit$data$categories[!indices,"Counts"]
## Compute the reduced score.
reduced_score <- PolyfunctionalityScore(x = mean_gamma, degree = degree, n = log(256,2))
#combine with the usual scores
merged <- scores(simplefit) %>%
arrange(`PATIENT ID`) %>%
bind_cols(reduced_PFS = reduced_score)
# do these make sense
all(merged$reduced_PFS <= merged$PFS)
#> [1] TRUE Created on 2018-08-14 by the reprex |
this can be partly accomplished with the FS and PFS APIs passing in |
The lab wants to exclude IL4 single positive, GzB single positive and IL14/GzB double positive cell combinations from the Polyfunctionality Score calculations due to artefacts detected with those antibodies.
The text was updated successfully, but these errors were encountered: