nf-cellranger-tools

Collection of Nextflow tools for using CellRanger

Tools

mkfastq

Docs: Generating FASTQs with cellranger mkfastq

Parameters

bcl_run_folder: Folder containing the Illumina sequencer's base call files
samplesheet: Sample sheet defining data structure, either an Illumina Experiment Manager sample sheet or a simple three-column CSV
output: Path for output files
input_type: Define the format of the sample sheet input, either "samplesheet" or "csv" (default: "samplesheet")
filter_dual_index: Optional. Only demultiplex samples identified by i7/i5 dual indices (e.g., SI-TT-A6). Single-index samples are ignored and will not be demultiplexed
filter_single_index: Optional. Only demultiplex samples identified by an i7-only sample index. Dual-indexed samples are ignored and will not be demultiplexed
lanes: Optional. Comma-delimited list of lanes to demultiplex (e.g., 1,3). Use this if you have a sample sheet for an entire flow cell but only want to generate FASTQs from a few lanes for further 10x Genomics analysis.
use_bases_mask: Same meaning as for bcl2fastq. Use to clip extra bases off a read if you ran extra cycles for QC.
delete_undetermined: Delete the Undetermined FASTQs generated by bcl2fastq. Useful if you are demultiplexing a small number of samples from a large flow cell.
barcode_mismatches: Same meaning as for bcl2fastq. Use this option to change the number of allowed mismatches per index adapter (0, 1, 2). Default: 1.
project: Custom project name, to override the sample sheet or to use in conjunction with the --csv argument.
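
To illustrate the "csv" input type, here is a minimal Python sketch that writes a simple three-column sample sheet. It assumes the Lane, Sample, Index column layout described in the 10x Genomics mkfastq documentation. The lane number and sample name are hypothetical placeholders, and the index reuses the SI-TT-A6 example from the parameter list.

```python
import csv
import io

# Hypothetical rows: (lane, sample name, 10x sample index set)
ROWS = [
    ("1", "example_sample", "SI-TT-A6"),
]


def build_simple_csv(rows):
    """Return the text of a three-column mkfastq sample sheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row assumed from the 10x Genomics simple CSV layout
    writer.writerow(["Lane", "Sample", "Index"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_simple_csv(ROWS), end="")
```

The resulting text can be saved to a file and passed as samplesheet with input_type set to "csv".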

count

Docs: Single-Library Analysis with cellranger count

Parameters

output: Path for output files
sample_whitelist: Optional text file used to specify the subset of samples to process, one per line (no header)
fastq_dir: Directory containing all FASTQ files
transcriptome_dir: Directory containing transcriptome reference files (see below)
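
The sample_whitelist format (one sample name per line, no header) can be sketched in Python as follows. This is only an illustration of how such a file might be read and applied, not the workflow's own implementation. The function names and sample names are hypothetical.

```python
def parse_whitelist(text):
    """Return sample names from whitelist text, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def select_samples(all_samples, whitelist=None):
    """Keep only whitelisted samples; keep all samples if no whitelist is given."""
    if whitelist is None:
        return list(all_samples)
    allowed = set(whitelist)
    return [sample for sample in all_samples if sample in allowed]


if __name__ == "__main__":
    # Hypothetical whitelist contents and sample names
    whitelist = parse_whitelist("sampleA\nsampleC\n")
    print(select_samples(["sampleA", "sampleB", "sampleC"], whitelist))
```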

References

The default transcriptome reference in the workflow is:

/shared/biodata/reference/10x/refdata-gex-GRCh38-2020-A

VDJ

Docs: Analysis of V(D)J data

Parameters

output: Path for output files
sample_whitelist: Optional text file used to specify the subset of samples to process, one per line (no header)
fastq_dir: Directory containing all FASTQ files
vdj_dir: Directory containing VDJ reference files (see below)

References

The default VDJ reference in the workflow is:

/shared/biodata/reference/10x/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0

multi

To analyze samples which have been prepared with multiple complementary methodologies, the flexible cellranger multi analysis module is used.

Sample Grouping

To account for a wide variety of experimental designs, the multi workflow in this repository uses a simple input format which lists each of the different libraries in a single table alongside the methodology which was used to prepare it.

For example, a single sample (sc5p_v2_hs_B_1k) may have been prepared using two parallel methods: 5' gene expression (sc5p_v2_hs_B_1k_5gex) and V(D)J (sc5p_v2_hs_B_1k_b). The sample grouping table describing this experimental design would be:

sample	grouping	feature_types
sc5p_v2_hs_B_1k_5gex	sc5p_v2_hs_B_1k	Gene Expression
sc5p_v2_hs_B_1k_b	sc5p_v2_hs_B_1k	VDJ

Allowed values for feature_types are (ref):

Gene Expression
VDJ
VDJ-T
VDJ-B
Antibody Capture (see below)
CRISPR Guide Capture (see below)
Multiplexing Capture (see below)

Note that the sample grouping table must be provided in CSV format.
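
Because the grouping table must be a CSV, it can help to generate it programmatically and check the feature_types values at the same time. The sketch below does this in Python for the example design above. The function name build_grouping_csv is hypothetical, and the validation reflects only the allowed values listed in this README.

```python
import csv
import io

# Allowed values for the feature_types column, as listed in this README
ALLOWED_FEATURE_TYPES = {
    "Gene Expression",
    "VDJ",
    "VDJ-T",
    "VDJ-B",
    "Antibody Capture",
    "CRISPR Guide Capture",
    "Multiplexing Capture",
}


def build_grouping_csv(rows):
    """Validate (sample, grouping, feature_types) rows and return CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample", "grouping", "feature_types"])
    for sample, grouping, feature_type in rows:
        if feature_type not in ALLOWED_FEATURE_TYPES:
            raise ValueError(f"Unsupported feature type: {feature_type!r}")
        writer.writerow([sample, grouping, feature_type])
    return buffer.getvalue()


rows = [
    ("sc5p_v2_hs_B_1k_5gex", "sc5p_v2_hs_B_1k", "Gene Expression"),
    ("sc5p_v2_hs_B_1k_b", "sc5p_v2_hs_B_1k", "VDJ"),
]
print(build_grouping_csv(rows), end="")
```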

Antibody or CRISPR Guide Capture

When analyzing a sample using antibody capture or CRISPR guide capture, a feature reference CSV must be provided using the format described here.

Note that either antibody capture or CRISPR guide capture may be analyzed, but not both at the same time.

Multiplexing Capture

Optionally, if Cell Multiplexing Oligos (CMOs) were used to multiplex samples in a single GEM well, omit the sample column from the sample grouping table and provide a second table mapping each sample to its CMO IDs in this library. If multiple CMOs were used for a sample, separate the IDs with a pipe (e.g., CMO301|CMO302).

An example CMO mapping table would look like:

sample_id	cmo_ids
Jurkat	CMO301
Raji	CMO302
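
The pipe-separated cmo_ids convention can be illustrated with a short Python sketch that reads a mapping table into a dictionary. It assumes the table is stored as comma-delimited text, which this README does not state explicitly. The function name parse_cmo_table and the third sample row are hypothetical.

```python
import csv
import io


def parse_cmo_table(text):
    """Map each sample_id to its list of CMO IDs (pipe-separated in the table)."""
    reader = csv.DictReader(io.StringIO(text))
    mapping = {}
    for row in reader:
        mapping[row["sample_id"]] = row["cmo_ids"].split("|")
    return mapping


# Jurkat and Raji come from the example table; "Pooled" is a hypothetical
# sample added to show the multi-CMO case.
example = (
    "sample_id,cmo_ids\n"
    "Jurkat,CMO301\n"
    "Raji,CMO302\n"
    "Pooled,CMO303|CMO304\n"
)
print(parse_cmo_table(example))
```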

When using multiplexing capture, the sample grouping table must contain a library with the feature_types annotated as Multiplexing Capture, e.g.

library	feature_types
sc5p_v2_hs_B_1k_5gex	Gene Expression
sc5p_v2_hs_B_1k_mux	Multiplexing Capture

Parameters

output: Path for output files
grouping: Path to sample grouping CSV
fastq_dir: Directory containing all FASTQ files
transcriptome_dir: Directory containing transcriptome reference files
vdj_dir: Directory containing VDJ reference files
multiplexing: Path to multiplexing capture table (optional)
feature_csv: Feature Reference CSV used for either Antibody Capture or CRISPR Guide Capture (optional)

Note that both transcriptome_dir and vdj_dir must always be specified, although their contents will only be accessed if the corresponding sample type is provided.
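
One way to keep these parameters together is a JSON file passed to Nextflow with its -params-file option. The Python sketch below assembles such a file and checks that the required parameters are present. Whether this repository is usually run this way is an assumption. The output, grouping, and fastq_dir paths are hypothetical, while the two reference paths are the defaults quoted in the count and VDJ sections.

```python
import json

REQUIRED = ("output", "grouping", "fastq_dir", "transcriptome_dir", "vdj_dir")
OPTIONAL = ("multiplexing", "feature_csv")


def build_params(**values):
    """Check parameter names and return them as a JSON string."""
    missing = [key for key in REQUIRED if key not in values]
    if missing:
        raise ValueError("Missing required parameters: " + ", ".join(missing))
    unknown = [key for key in values if key not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError("Unknown parameters: " + ", ".join(unknown))
    return json.dumps(values, indent=2, sort_keys=True)


params_json = build_params(
    output="results/multi",          # hypothetical path
    grouping="grouping.csv",         # hypothetical path
    fastq_dir="fastqs/",             # hypothetical path
    transcriptome_dir="/shared/biodata/reference/10x/refdata-gex-GRCh38-2020-A",
    vdj_dir="/shared/biodata/reference/10x/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0",
)
print(params_json)
```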

Resource Allocation

The number of CPUs and the amount of memory available to each task can be customized with the parameters -process.cpus (default: 16) and -process.memory (default: 64.GB).

Testing

The workflows in this repository may be tested by downloading example datasets hosted by 10x Genomics and running the appropriate analyses locally.

To download the example datasets and all necessary reference data, navigate to test/ and run download_inputs.sh.

Before running the tests, make sure that Nextflow is installed on your host system.

NOTE: The CellRanger utility is sourced by default from an EasyBuild module which is assumed to be available on the host system (using beforeScript = "ml CellRanger/6.1.1"). If CellRanger is available from another source, it can be loaded for the testing suite by adding an appropriate configuration file (nextflow.config) to the working directory used for testing.

To run tests, navigate to test/ and run bash run.sh.
