Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when fitting the compass model #81

Open
quinnpeters opened this issue Jan 4, 2024 · 8 comments
Open

Error when fitting the compass model #81

quinnpeters opened this issue Jan 4, 2024 · 8 comments

Comments

@quinnpeters
Copy link

When I try to fit the COMPASS Model with a large compass container, I get the following error which I cannot figure out how to solve:

fit <- COMPASS(CC_comb,
treatment = Stim == "SPIKE",
control = Stim == "DMSO",
iterations = 40000)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 2, 7

As far as I can tell, there is no issue with the metadata or counts. The data structure of the compass container, however, must be causing some issue?

@emjbishop
Copy link

Myself and three other people are all getting the same error now when running the COMPASS function, using different datasets (including test data that used to work) and systems (linux, macOS, and windows). Is there a dependency that was updated this year causing this?

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 2, 7

@gfinak
Copy link
Member

gfinak commented Apr 23, 2024

Can you share data / code so that I might reproduce the error?

@emjbishop
Copy link

emjbishop commented Apr 23, 2024

@gfinak yes, we have a tutorial repo that uses COMPASS. You just need:

Here is a more complete error message with context:

> CC <- COMPASSContainerFromGatingSet(gs,
+                                     node = parent_node,
+                                     individual_id = id,
+                                     mp = markermap,
+                                     countFilterThreshold = 5000)
Extracting cell counts
Fetching 4+
Fetching child nodes
common markers are: 
Time FSC-A FSC-H SSC-A CD8b TNFa CD107a CD154 CD3 IL2 CD4 IL17a IL4_5_13 CD14_19 CCR7 CD38 LD IFNg CD45RA HLADR 
Extracting single cell data for 4+/IL2|4+/IL4513|4+/IFNG|4+/TNF|4+/IL17|4+/154|4+/107a
..............................Creating COMPASS Container
Filtering low counts
Filtering 0 samples due to low counts
Warning message:
In COMPASSContainer(data = sc_data, counts = counts, meta = pd,  :
  There appear to be negative intensities in the 'data' supplied.
> fit <- COMPASS(CC,
+                treatment = STIM == "Spike 1",
+                control = STIM == "DMSO",
+                iterations = 100)
There are a total of 5 samples from 5 individuals in the 'treatment' group.
There are a total of 5 samples from 5 individuals in the 'control' group.
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 2, 7

I've also heard from a colleague (edit: @quinnpeters) that they don't get the error with a much older version of R (v3.6.3).

@malisas
Copy link
Contributor

malisas commented Aug 30, 2024

Hi @emjbishop I got your separate private message to me about this issue.
Thanks for providing the reproducible example. Just want to chime in that I was able to reproduce, and I think the issue has to do with NA values getting generated by COMPASSContainerFromGatingSet, which internally calls flowWorkspace::gs_get_singlecell_expression. I wonder if more recent R versions introduced different behavior in a dependency to this function, and caused NA values to be generated.

Here's some runtime output showing how COMPASS failed on CC but succeeded on CC2. NA values don't always get generated it seems: Both COMPASSContainers were generated by the same call to COMPASSContainerFromGatingSet, but CC contains NA values in the data used by COMPASS:

> CC <- COMPASSContainerFromGatingSet(gs,
+                                     node = parent_node,
+                                     individual_id = id,
+                                     mp = markermap,
+                                     countFilterThreshold = 5000)
Extracting cell counts
Fetching 4+
Fetching child nodes
common markers are: 
Time FSC-A FSC-H SSC-A CD8b TNFa CD107a CD154 CD3 IL2 CD4 IL17a IL4_5_13 CD14_19 CCR7 CD38 LD IFNg CD45RA HLADR 
Extracting single cell data for 4+/IL2|4+/IL4513|4+/IFNG|4+/TNF|4+/IL17|4+/154|4+/107a
..............................Creating COMPASS Container
Filtering low counts
Filtering 0 samples due to low counts
Warning message:
In COMPASSContainer(data = sc_data, counts = counts, meta = pd,  :
  There appear to be negative intensities in the 'data' supplied.


> CC2 <- COMPASSContainerFromGatingSet(gs,
+                                     node = parent_node,
+                                     individual_id = id,
+                                     mp = markermap,
+                                     countFilterThreshold = 5000)
Extracting cell counts
Fetching 4+
Fetching child nodes
common markers are: 
Time FSC-A FSC-H SSC-A CD8b TNFa CD107a CD154 CD3 IL2 CD4 IL17a IL4_5_13 CD14_19 CCR7 CD38 LD IFNg CD45RA HLADR 
Extracting single cell data for 4+/IL2|4+/IL4513|4+/IFNG|4+/TNF|4+/IL17|4+/154|4+/107a
..............................Creating COMPASS Container
Filtering low counts
Filtering 0 samples due to low counts
Warning message:
In COMPASSContainer(data = sc_data, counts = counts, meta = pd,  :
  There appear to be negative intensities in the 'data' supplied.


> CC$data$`114716.fcs_445737`[100,]
          IL2      IL4_5_13          IFNg          TNFa         IL17a         CD154        CD107a 
1.018558e-312 2.781342e-309 8.257046e-317  1.746838e+03 2.483463e-265 1.807873e-308 1.157082e-309 
> CC2$data$`114716.fcs_445737`[100,]
          IL2      IL4_5_13          IFNg          TNFa         IL17a         CD154        CD107a 
1.820499e-318 1.823898e-318 1.827297e-318  1.746838e+03 1.834095e-318 1.837494e-318 1.840894e-318 


> dmso_spike_1_sn <- CC2$meta %>% dplyr::filter(STIM %in% c("DMSO", "Spike 1")) %>% dplyr::pull(name)
> table(unlist((lapply(CC$data[dmso_spike_1_sn], function(x) {any(is.na(x))}))))

FALSE  TRUE 
    9     1 
> table(unlist((lapply(CC2$data[dmso_spike_1_sn], function(x) {any(is.na(x))}))))

FALSE 
   10 


> fit <- COMPASS(CC,
+                treatment = STIM == "Spike 1",
+                control = STIM == "DMSO",
+                iterations = 100)
There are a total of 5 samples from 5 individuals in the 'treatment' group.
There are a total of 5 samples from 5 individuals in the 'control' group.
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 2, 7


> any(is.na(CC$data$`114716.fcs_445737`))
[1] TRUE
> any(is.na(CC2$data$`114716.fcs_445737`))
[1] FALSE


> fit2 <- COMPASS(CC2,
+                treatment = STIM == "Spike 1",
+                control = STIM == "DMSO",
+                iterations = 100)
There are a total of 5 samples from 5 individuals in the 'treatment' group.
There are a total of 5 samples from 5 individuals in the 'control' group.
The model will be run on 5 paired samples.
The category filter has removed 123 of 125 categories.
There are a total of 2 categories to be tested.
Initializing parameters...
Computing initial parameter estimates...
Keeping 100 iterations. We'll thin every 8 iterations.
Burnin for 100 iterations...
Sampling 800 iterations...
Done!
Computing the posterior difference in proportions, posterior log ratio...
Done!
> fit2
A COMPASS model fit on 5 paired samples.
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.16.0    COMPASS_1.38.1       flowWorkspace_4.12.2 flowCore_2.12.2      CytoML_2.12.0        here_1.0.1          

loaded via a namespace (and not attached):
 [1] tidyr_1.3.1         utf8_1.2.3          generics_0.1.3      lattice_0.21-8      digest_0.6.33       magrittr_2.0.3      evaluate_0.21       grid_4.3.1         
 [9] RColorBrewer_1.1-3  fastmap_1.1.1       rprojroot_2.0.3     ggcyto_1.28.1       plyr_1.8.9          jsonlite_1.8.7      graph_1.78.0        gridExtra_2.3      
[17] BiocManager_1.30.22 purrr_1.0.2         fansi_1.0.4         scales_1.3.0        XML_3.99-0.16       Rgraphviz_2.44.0    abind_1.4-5         cli_3.6.1          
[25] rlang_1.1.1         RProtoBufLib_2.12.1 Biobase_2.60.0      munsell_0.5.0       yaml_2.3.7          cytolib_2.12.1      tools_4.3.1         ncdfFlow_2.46.0    
[33] dplyr_1.1.4         colorspace_2.1-0    ggplot2_3.4.4       BiocGenerics_0.46.0 vctrs_0.6.4         R6_2.5.1            matrixStats_1.1.0   stats4_4.3.1       
[41] lifecycle_1.0.3     zlibbioc_1.46.0     S4Vectors_0.38.2    RBGL_1.76.0         clue_0.3-65         pdist_1.2.1         cluster_2.1.4       pkgconfig_2.0.3    
[49] pillar_1.9.0        hexbin_1.28.3       gtable_0.3.4        glue_1.6.2          Rcpp_1.0.11         xfun_0.40           tibble_3.2.1        tidyselect_1.2.0   
[57] knitr_1.43          rstudioapi_0.15.0   htmltools_0.5.6     rmarkdown_2.24      compiler_4.3.1     

Haven't gotten to the root of the problem but documenting the investigation so far.

@gfinak
Copy link
Member

gfinak commented Aug 30, 2024

I suggest looking at the SimpleCOMPASS interface. CompassContainerFromGatingSet makes a lot of assumptions and we ended up abandoning it internally.

@emjbishop
Copy link

emjbishop commented Aug 31, 2024

Thank you @malisas and @gfinak, we will switch to SimpleCOMPASS (which seems to work with R 4.4.1). We appreciate it and will pass this on to our collaborators.

@emjbishop
Copy link

Hello again, we appreciate the suggestion to use SimpleCOMPASS and indeed it works well. However, when we have dozens of subject samples with multiple stims and cell types it's not practical to generate the boolean combinations in FlowJo for each subject. Do you have a way to automatically generate the boolean combinations, or a method in FlowJo?

@gfinak
Copy link
Member

gfinak commented Nov 19, 2024

Indeed, we would typically export the flowJo wsp file, read it in with flowWorskpace, then there are some internal functions that could generate the boolean counts from the marginal gated data at the single cell level.

The doc for SimpleCompass highlights the functions used:

##' n_s <- CellCounts( data[1:n], Combinations(k) )
##' n_u <- CellCounts( data[(n+1):(2*n)], Combinations(k) )

CellCounts can be used generate the data you need for SimpleCompass, given inputs are a list of matrices, where each matrix is the single cell expression of cells from a cell population in a sample (e.g. CD4 T cells) for rows, and columns are each marker/protein measured and thresholded to be 0 if it's below the gate (i.e. negative).
With some work on a GatingSet created from the FCS files and an imported flowJo workspace, you can generate these matrices, and the required data (left as an exercise for the reader).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants