scanorama from this repo vs scanpy external #140

Open
mortunco opened this issue Jan 25, 2023 · 0 comments

mortunco commented Jan 25, 2023

Hi,

Thank you very much for the tool. I am very interested in using it, as the only label we have in the data is the dataset name (~samples). The remaining labels we have are somewhat arbitrary. I would like to try batch correction to eliminate the dataset differences. I am aware that our biases are complex, such as different cancer + different lab, but I just want to give it a try. I apologise if it's too much to correct.

I have a dataset of normalized values. I previously merged this dataset into a single anndata object. We use the "dataset" column as the batch label. In this data I have 27k cells.

I am a little bit confused about doing batch correction with scanorama.

Method 1

Using https://scanpy.readthedocs.io/en/stable/generated/scanpy.external.pp.scanorama_integrate.html#

Load the object.
Normalize and run PCA.
Add the batch label.
Integrate.

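import scanpy as sc
import scanpy.external as sce

# Load the object.
adata = sc.datasets.pbmc3k()
# Normalize and run PCA.
sc.pp.recipe_zheng17(adata)
sc.tl.pca(adata)
# Add the batch label (here the cells of each batch are stored contiguously).
adata.obs['batch'] = 1350*['a'] + 1350*['b']
# Integrate, then check that the result is in adata.obsm['X_scanorama'].
sce.pp.scanorama_integrate(adata, 'batch')
'X_scanorama' in adata.obsm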
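
One detail that may matter for an object merged from many datasets: the scanpy documentation for scanorama_integrate states that cells from the same batch must be stored contiguously in adata. Below is a minimal sketch of how that could be checked and, if needed, fixed by reordering rows. The helper names `batches_are_contiguous` and `order_grouping_batches` are hypothetical and not part of scanpy or scanorama, and the "dataset" column name is taken from my own data.

```python
def batches_are_contiguous(labels):
    """Return True if every batch label occupies one unbroken run of rows."""
    seen = set()
    previous = object()  # sentinel that never equals a real label
    for label in labels:
        if label != previous:
            if label in seen:
                # This batch already ended earlier and is now starting again.
                return False
            seen.add(label)
            previous = label
    return True


def order_grouping_batches(labels):
    """Return a row order that groups each batch together.

    Batches appear in first-seen order and rows keep their relative
    order within a batch.
    """
    groups = {}
    for row, label in enumerate(labels):
        groups.setdefault(label, []).append(row)
    return [row for rows in groups.values() for row in rows]


# Hypothetical usage with a merged AnnData (assumes the batch column is "dataset"):
# labels = list(adata.obs["dataset"])
# if not batches_are_contiguous(labels):
#     adata = adata[order_grouping_batches(labels)].copy()
```

Reordering this way only changes the row order of the object; whether it is needed depends on how the datasets were concatenated in the first place.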

Method 2

Using https://github.com/brianhie/scanorama#troubleshooting

Join all adatas into a list.
Integrate and batch correct.

import scanorama

# List of datasets (placeholder: a list of scanpy.AnnData objects).
adatas = [...]
# Integration.
scanorama.integrate_scanpy(adatas)
# Batch correction.
corrected = scanorama.correct_scanpy(adatas)
# Integration and batch correction.
corrected = scanorama.correct_scanpy(adatas, return_dimred=True)

Question 1.

  • In the tutorial for Method 2, I always see a list of AnnData objects being created. Given that I already have my cells in a single anndata object, do I have to split it again?
  • Method 2 does not take a batch key, unlike sce.pp.scanorama_integrate(adata, 'batch'). Should I be worried about this?
  • Method 1 does not seem to include a correction step. Should I add correct_scanpy to Method 1?
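
If splitting does turn out to be necessary, this is roughly what I imagine it would look like. It is only a sketch under my own assumptions: the batch column is called "dataset", `group_rows_by_batch` and `split_by_batch` are hypothetical helpers (not part of scanorama or scanpy), and the scanorama calls are left as comments because I have not confirmed this is the intended workflow.

```python
def group_rows_by_batch(labels):
    """Map each batch label to the row positions that carry it (first-seen order)."""
    groups = {}
    for row, label in enumerate(labels):
        groups.setdefault(label, []).append(row)
    return groups


def split_by_batch(adata, batch_key="dataset"):
    """Return one object per batch, in first-seen batch order.

    Assumes `adata` behaves like an anndata.AnnData: `adata.obs[batch_key]`
    yields one label per cell and `adata[rows]` selects cells by position.
    """
    groups = group_rows_by_batch(list(adata.obs[batch_key]))
    return [adata[rows].copy() for rows in groups.values()]


# Hypothetical usage, following the Method 2 snippet above:
# import scanorama
# adatas = split_by_batch(adata, "dataset")
# corrected = scanorama.correct_scanpy(adatas, return_dimred=True)
```

With this approach the batch information would be carried implicitly by the list (one entry per dataset) rather than by a batch key, which is part of what I would like to confirm.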

Thank you very much for your help,

Best regards,

Tunc.
