
Out of memory error #122

Open
shashwatsahay opened this issue Sep 16, 2023 · 6 comments

@shashwatsahay

Hi

I wanted to use your tool to correct the bleeding effect, but every time I ran it, it threw a memory error.

My count matrix is approximately 40,000 × 3,000. I ran the tool on my HPC, where each node has about 500 GB of RAM; memory consumption peaked and the run died with an out-of-memory error. Is there a solution to this?

@jeffquinn-msk (Contributor) commented Sep 18, 2023

Hello,

Thanks for writing in. That sounds a bit strange; 500 GB is an astronomical amount of RAM and should certainly be enough to handle a 3,000-spot dataset. Is that 500 GB on a single server, or RAM distributed over multiple servers? BayesTME is not significantly parallelizable, so I would recommend running it on one server.

Your message alone doesn't give me enough information to diagnose the root cause. Is there anything else you can share with me (either in this thread or privately, if you would prefer)?

For instance, are you running BayesTME in the provided Docker container? Using the provided Nextflow pipeline?

If you are not using the provided Nextflow pipeline, can you show me the verbatim command you ran and the full stack trace that resulted? (Please attach a file with all console output, even if it is thousands of lines; it's helpful.)

Thanks,

Jeff

@shashwatsahay (Author)

Hi

I think there was a misunderstanding: it's a 40,000-spot by 3,000-gene matrix.

The memory was on a single machine. The program was run via the command-line interface inside a conda environment built from https://github.com/tansey-lab/bayestme/blob/main/bayestme.conda.yml.

There was no error stack trace; the job was simply killed by the SLURM job monitor after it exceeded the memory requested.

The command used was:

bleeding_correction --adata adata.h5ad --adata-output dataset_filtered_corrected.h5ad --bleed-out bleed_correction_results.h5 --n-top 50 --max-steps 5 --verbose

The console output before the job was killed:

2023-09-15 19:16:51,405 - bayestme.bleeding_correction - INFO - Fitting basis functions to first 50 genes
Fitting bleed correction basis functions:   0%|                                                                                                                                            | 0/5 [00:00<?, ?it/s]2023-09-15 19:16:51,705 - bayestme.bleeding_correction - INFO -
Step 1/5
Killed

@jeffquinn-msk (Contributor) commented Sep 19, 2023

Oh wow, there are 40,000 spots; OK, that explains it! The algorithm has O(N^2) memory complexity in the number of spots, so it would need over 500 GB for the basis function tensors alone at that many spots.
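
As a rough back-of-envelope illustration of that O(N^2) scaling (the tensor counts below are assumptions for illustration only, not BayesTME's actual allocations):

# Illustrative memory estimate for dense spot-by-spot tensors; not
# BayesTME's actual allocation code.
n_spots = 40_000
bytes_per_float64 = 8

# One dense n_spots x n_spots float64 matrix (e.g. pairwise spot distances):
one_matrix_gb = n_spots ** 2 * bytes_per_float64 / 1e9
print(f"one {n_spots} x {n_spots} float64 matrix: {one_matrix_gb:.1f} GB")  # ~12.8 GB

# A few dozen such tensors (across basis-function types and the genes being
# fit) would exceed a 500 GB node, whereas at typical Visium scale the same
# matrices stay small:
print(f"at 5,000 spots: {5_000 ** 2 * bytes_per_float64 / 1e9:.2f} GB")  # ~0.2 GB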

What platform are you using? Visium 10x (at least in all the data I have seen so far) has at most about 5,000 spots per slide; is this a different platform?

@shashwatsahay (Author)

I am using Slide-seq for this. Based on the mathematical calculation we should have approximately 80,000 beads, but we don't recover all of them because of microscopy or bead-synthesis errors.

@jeffquinn-msk (Contributor)

Bleeding correction is probably only useful for the Visium 10x platform; I'd recommend running BayesTME without bleeding correction on your data. Bleeding correction is an optional step in the pipeline.

@tansey (Contributor) commented Jan 31, 2024

If you want to run it with that many spots, you could cut the number of genes down to 10 or so. We typically run it on Visium with 50 genes but don't notice much difference at 10-20.
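
A sketch of that adjustment, assuming --n-top is the flag that controls how many genes the basis functions are fit to (the command earlier in this thread used --n-top 50, and the log reported fitting to the first 50 genes):

bleeding_correction --adata adata.h5ad --adata-output dataset_filtered_corrected.h5ad --bleed-out bleed_correction_results.h5 --n-top 10 --max-steps 5 --verbose

This only changes the number of genes used to fit the bleed-correction basis functions; the rest of the command is unchanged from the one posted above.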
