bm2-lab/SCMMI_Benchmark

Single-cell Multi-modal Integrations Benchmark (SCMMIB)

SCMMIB introduction

The SCMMIB project provides a benchmark workflow for evaluating the usability, accuracy, robustness and scalability of single-cell multimodal integration algorithms. It covers 65 single-cell multi-modal integration methods from 40 algorithms, spanning DNA, RNA, protein and spatial multi-omics modalities, for paired integration, unpaired diagonal integration, and unpaired mosaic integration.
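
The three integration settings differ in which modalities are measured in the same cells. The sketch below illustrates the distinction under simplified, commonly used definitions (paired: every batch profiles all modalities in the same cells; diagonal: every batch profiles a single modality; mosaic: anything in between). The function name classify_task and the input layout are hypothetical and are not part of the scmmib package.

```python
def classify_task(batches):
    """Classify an integration task from the modalities measured per batch.

    `batches` maps a batch name to the modalities profiled jointly in the
    cells of that batch. This is an illustrative helper, not scmmib API.
    """
    batches = {name: set(mods) for name, mods in batches.items()}
    if not batches:
        raise ValueError("at least one batch is required")
    all_modalities = set().union(*batches.values())
    if len(all_modalities) < 2:
        raise ValueError("multimodal integration needs at least two modalities")
    if all(mods == all_modalities for mods in batches.values()):
        return "paired"    # every cell carries every modality
    if all(len(mods) == 1 for mods in batches.values()):
        return "diagonal"  # no cell carries more than one modality
    return "mosaic"        # some batches are multimodal, others are not


print(classify_task({"batch1": {"RNA", "ATAC"}, "batch2": {"RNA", "ATAC"}}))
print(classify_task({"rna_batch": {"RNA"}, "atac_batch": {"ATAC"}}))
print(classify_task({"multiome": {"RNA", "ATAC"}, "rna_only": {"RNA"}}))
```

The three calls print paired, diagonal and mosaic, in that order.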

SCMMIB workflow

[Figure: SCMMIB workflow]

SCMMIB package

We developed a Python package, scmmib, built on the scanpy pipeline. It draws on integration metrics from the scib and scglue packages and extends them to different single-cell multimodal integration tasks.

The knn_smooth function in the scmmib package was sourced from a public kNN smoothing method (see the knn_smoothing paper and its GitHub repository).
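
To give a sense of the idea behind kNN smoothing, here is a deliberately simplified sketch: each cell's counts are replaced by the sum over the cell itself and its k nearest cells. It is not the scmmib knn_smooth implementation, and it omits the normalization and other refinements described in the knn_smoothing paper. The function name knn_smooth_sketch is hypothetical.

```python
import math


def knn_smooth_sketch(counts, k):
    """Sum each cell's counts with those of its k nearest cells.

    `counts` is a list of cells, each a list of per-gene counts.
    Neighbors are ranked by Euclidean distance on the raw counts,
    which is a simplification made for illustration.
    """
    n_cells = len(counts)
    if not 0 <= k < n_cells:
        raise ValueError("k must satisfy 0 <= k < number of cells")
    smoothed = []
    for i, cell in enumerate(counts):
        # Rank all other cells by distance to cell i and keep the closest k.
        neighbors = sorted(
            (j for j in range(n_cells) if j != i),
            key=lambda j: math.dist(cell, counts[j]),
        )[:k]
        # Aggregate the cell's own counts with those of its neighbors.
        group = [cell] + [counts[j] for j in neighbors]
        smoothed.append([sum(gene) for gene in zip(*group)])
    return smoothed


toy_counts = [[1, 0], [2, 0], [0, 5], [0, 6]]
print(knn_smooth_sketch(toy_counts, k=1))
```

Summing (rather than averaging) keeps the output on a count scale, at the cost of increasing the total count per cell by roughly a factor of k + 1.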

The user tutorial and API documentation are available online (https://scmmib.readthedocs.io/en/latest/).

The scmmib package also includes a simplified summary visualization tool written in R, plot_scmmib_table.r.

Dependencies

  • Python >=3.8 with scib, scglue and scanpy, for the scmmib Python package.
  • R >=3 with dplyr, scales, ggimage, ggplot2 and cowplot, for the plot_scmmib_table.r R tool.

Installation

  1. Prepare the environment.
  • Option 1: install the dependencies with pip. For example, the Python dependencies can be installed as follows:
# pip install scib scglue scanpy # install the main dependencies into an existing environment
pip install -r pip_requirement.txt # install all Python dependencies with pinned versions
  • Option 2: create a new conda environment from the provided environment file (stable).
    The conda tool (miniconda) can be installed from the Anaconda website.
    Then create and activate the conda environment:
conda env create -f scmmib_env.yml
conda activate scmmib
  2. Install the scmmib package.
# download SCMMIB
git clone https://github.com/bm2-lab/SCMMI_Benchmark
# change into the cloned folder (the name is case-sensitive)
cd SCMMI_Benchmark
pip install .
  3. Test the installation in Python.
import scmmib
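
If the import fails, it can help to check which of the required modules are visible to the current interpreter. The sketch below uses only the standard library; the helper name missing_modules is hypothetical. Note that importlib.util.find_spec only locates a module without importing it, so it confirms that a package is installed, not that it imports cleanly.

```python
import importlib.util

# Top-level module names listed under Dependencies, plus scmmib itself.
REQUIRED = ("scanpy", "scib", "scglue", "scmmib")


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```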

FAQ:

  • The graph LISI metrics may fail with the following error:
FileNotFoundError, [Errno 2] No such file or directory: '/tmp/lisi_svo3el2i/graph_lisi_indices_0.txt'

The scib project has a related GitHub issue, along with a possible solution.

A simplified summary visualization tool plot_scmmib_table.r

plot_scmmib_table.r is a simplified tool for visualizing summary tables. It is adapted from the funkyheatmap package and the scib_knit_table function in the scib package, both of which require complex input formats with numerous restrictions.
A demo output: [Figure: demo summary table]

plot_scmmib_table.r can be used on its own and takes a simple R data.frame as input. All summary figures were generated with this tool.
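
Because the tool takes a plain data.frame, metric results computed in Python can be passed to it through a flat CSV file. The sketch below writes one row per method and one column per metric. The column names, the helper write_summary_csv and the values are placeholders for illustration only; the columns the tool actually expects are described in its reference manual.

```python
import csv
import io


def write_summary_csv(rows, metric_names, handle):
    """Write one row per method, with one column per metric."""
    writer = csv.DictWriter(handle, fieldnames=["method", *metric_names])
    writer.writeheader()
    for row in rows:
        record = {"method": row["method"]}
        record.update({name: row[name] for name in metric_names})
        writer.writerow(record)


# Placeholder values for illustration only; these are not benchmark results.
rows = [
    {"method": "method_A", "metric_1": 0.5, "metric_2": 0.25},
    {"method": "method_B", "metric_1": 0.75, "metric_2": 0.5},
]
buffer = io.StringIO()
write_summary_csv(rows, ["metric_1", "metric_2"], buffer)
print(buffer.getvalue())
```

In R, a file written this way can be loaded with read.csv() to obtain a data.frame.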

We provide a demo notebook and a reference manual for plot_scmmib_table.r.

More examples can be found in the figure reproducibility code.

Benchmark datasets summary

All datasets analyzed in the SCMMIB study are listed below. Details of these datasets are given in our stage 1 manuscript. The processed datasets are available in a public Figshare repository (see the Citation section).

| Dataset name | Multi-omics | Batches | Species | Number of cells | Sample/tissue type |
| --- | --- | --- | --- | --- | --- |
| BMMC Multiome | scRNA + scATAC | 12 donors from 4 sites | Human | 69,249 | bone marrow mononuclear cells |
| BMMC CITE-seq | scRNA + ADT | 12 donors from 4 sites | Human | 90,261 | bone marrow mononuclear cells |
| HSPC Multiome | scRNA + scATAC | 4 donors of 5 time points | Human | 105,942 | hematopoietic stem and progenitor cells |
| HSPC CITE-seq | scRNA + ADT | 4 donors of 5 time points | Human | 70,988 | hematopoietic stem and progenitor cells |
| SHARE-seq skin | scRNA + scATAC | - | Mouse | 34,774 | skin |
| COVID19 CITE-seq | scRNA + ADT | 143 donors | Human | 781,123 | peripheral blood immune cells |
| 10X PBMC | scRNA + scATAC | 2 samples | Human | 15,021 | peripheral blood immune cells |
| 10X Mouse Brain | scRNA + scATAC | 2 replicates for 2 samples | Mouse | 12,138 | brain |
| Human white blood cell | scRNA + ADT | 8 donors of 3 time points | Human | 161,764 | white blood cell |
| 10X NSCLC | scRNA + ADT | 2 replicates | Human | 15,618 | NSCLC |
| 10X kidney cancer | scRNA + ADT | 7 donors | Human | 20,974 | kidney |
| Lymph node spatial | spatial + scRNA + ADT | 2 samples | Human | 6,843 | lymph node |
| Thymus spatial | spatial + scRNA + ADT | 4 samples | Mouse | 17,824 | thymus |
| Spleen SPOTS | spatial + scRNA + ADT | 2 samples | Mouse | 5,336 | spleen |

Benchmark Methods

All benchmark methods analyzed in the SCMMIB study are listed below. Details of these methods are available in our Registered Report Stage 1 manuscript in the figshare folder.

| Method | Journal | Year |
| --- | --- | --- |
| Liger iNMF | Cell | 2019 |
| Seurat v3 CCA | Cell | 2019 |
| Seurat v4 RPCA | Cell | 2019 |
| Citefuse | Bioinformatics | 2020 |
| MOFA+ | Genome Biology | 2020 |
| scAI | Genome Biology | 2020 |
| unionCom | Bioinformatics | 2020 |
| cobolt | Genome Biology | 2021 |
| DCCA | Bioinformatics | 2021 |
| GLUE | Nature Biotechnology | 2021 |
| Liger online iNMF | Nature Biotechnology | 2021 |
| scDEC | Nature Machine Intelligence | 2021 |
| scMM | Cell Reports Methods | 2021 |
| scMVAE | Briefings in Bioinformatics | 2021 |
| Seurat v4 WNN | Cell | 2021 |
| totalVI | Nature Methods | 2021 |
| MultiMAP | Genome Biology | 2021 |
| bindSC | Genome Biology | 2022 |
| Liger UiNMF | Nature Communications | 2022 |
| Multigrate | bioRxiv | 2022 |
| Pamona | Bioinformatics | 2022 |
| SAILERX | Nucleic Acids Research | 2022 |
| SCALEX | Nature Communications | 2022 |
| sciPENN | Nature Machine Intelligence | 2022 |
| scMDC | Nature Communications | 2022 |
| scMVP | Genome Biology | 2022 |
| uniPort | Nature Communications | 2022 |
| scVAEIT | PNAS | 2022 |
| MEFISTO | Nature Methods | 2022 |
| DeepMAPS | Nature Communications | 2023 |
| GCN-SC | Briefings in Bioinformatics | 2023 |
| Maxfuse | Nature Biotechnology | 2023 |
| MultiVI | Nature Methods | 2023 |
| scMCs | Bioinformatics | 2023 |
| SIMBA | Nature Methods | 2023 |
| Seurat v5 bridge | Nature Methods | 2023 |
| Stabmap | Nature Methods | 2023 |
| scMoMaT | Nature Communications | 2023 |
| SpatialGlue | Nature Methods | 2024 |
| MIDAS | Nature Biotechnology | 2024 |
Related SCMMIB manuscript

Our stage 1 manuscript, "Benchmarking single-cell multi-modal data integrations", is publicly available in the Nature Methods Registered Report figshare folder.

Our stage 2 manuscript has been submitted.

Citation

Stage 1 manuscript

Fu, Shaliu; Wang, Shuguang; Si, Duanmiao; Li, Gaoyang; Gao, Yawei; Liu, Qi (2024). Benchmarking single-cell multi-modal data integrations. figshare. Journal contribution. https://doi.org/10.6084/m9.figshare.26789572.v1

Datasets

SCMMIB project processed datasets. figshare. Dataset. https://doi.org/10.6084/m9.figshare.27161451.v2
