Quantum Barcoding : Ultra-High Throughput Single Cell Analysis of Proteins and RNAs by Split-pool Synthesis

Quantum Barcoding (QBC) is a method enabling simultaneous, ultra-high throughput single-cell-barcoding, of millions of cells for targeted single cell analysis of proteins and RNAs. This method circumvents the need to isolate single cells by building cell-specific oligo barcodes dynamically within each cell. Cell-specific codes are added to each tagged molecule within cells.

This analysis workflow reads in paired-end fastq files of a sequenced library built using of QBC_v1.0 barcodes, generates a merged fasta file using FLASH and seqtk, processes it through QBC-parse_v1.0 , filters and normalizes to generate a final output in FCS 4.0 standard Flow Cytometry format.

QBC-parse_v1.0 - Deduplication based on unique molecular identifiers (UMI) – were performed using the QBC-parse_v1.0 which allows alignment of sequences with one mismatch.

The QBC algorithm sequentially: a) detects barcode via alignments, b) corrects barcode by efficiently comparing barcodes to a whitelist, c) deduplicates based on UMI, d) evaluates reads for chimera filtering (check for evidence of PCR-based cross-over), e) filters reads for underrepresented/artificially created cells and f)transforms sequences into table of cells and markers.

The data is further normalized as follows: expression values Ei,j for marker i in cell j were calculated by dividing unique read counts for marker i by the sum of the marker counts in cell j, to normalize for differences in coverage. The output is a corrected expression matrix which is then used as input for fcs conversion.

For more information contact us at [email protected]

QBC_Single_Cell_Analysis_NGS

Description:

This is a driver bash script to process the fasta files generated by QBC assay. The output of the script is an fcs file which can be uploaded to standard flow-cytometry analysis software for further analysis.

Platform Requirement:

Linux.

Initial Setup:

Java: Install java version "1.8.0_92"
Miniconda: Install Miniconda3
Python3: Install Python 3.6.0

Usage:

git clone https://github.com/bioinform/QBC_Single_Cell_Analysis_NGS.git
cd QBC_Single_Cell_Analysis_NGS
cd apps/parser/
bash linux_build.sh #This command builds the parser.
cd ../../

Now using your favorite editor update the custom_job.txt file with the location of read1 and read2 fastq files, qdata folder and output folder.

Parameter	Explanation
`read1`	Folder location for read1 fastq file.
`read2`	Folder location for read2 fastq file.
`qdata_folder`	Folder location for qdata folder containing oligos.txt, AHCA_Codes.txt, SC_Codes.txt, Singlet_settings.txt files. Please refer to example/qdata folder in this repo for the format of these files.
`output_folder`	Folder location for storing output files.

To generate the processed output from the raw fastq files type the following command at your terminal and press enter.

nohup bash submit_bash.sh custom_job.txt &

Output Folder Structure:

The bash script generates an output folder with the following top-level folder structure:

out_folder/
└── Exp180G_S1_L001
    ├── flash_output
    │   ├── Exp180G_S1_L001.extendedFrags.fastq
    │   ├── Exp180G_S1_L001.hist
    │   ├── Exp180G_S1_L001.histogram
    │   ├── Exp180G_S1_L001.notCombined_1.fastq
    │   └── Exp180G_S1_L001.notCombined_2.fastq
    ├── seqtk_output
    |   └── Exp180G_S1_L001.extendedFrags.fasta
    ├── parser_output
    │   ├── Exp180G_S1_L001_LStr_RC(-).bad
    │   ├── Exp180G_S1_L001_LStr_RC(-).byFCS
    │   ├── Exp180G_S1_L001_LStr_RC(-).crossoverFilter
    │   ├── Exp180G_S1_L001_LStr_RC(-).junk
    │   ├── Exp180G_S1_L001_LStr_RC(-).metrics
    │   └── Exp180G_S1_L001_LStr.Statistics
    ├── rmsinglets_output
    │   ├── Exp180G_S1_L001_LStr_RC(-)_10_5_5_1.JITTERED_forFCS
    │   ├── Exp180G_S1_L001_LStr_RC(-)_10_5_5_1.tab_delimited
    │   ├── Exp180G_S1_L001_LStr_RC(-)_10_5_5_1.UNJITTERED_forFCS
    │   └── Summary.txt
    ├── fcs_output
    │   ├── Exp180G_S1_L001_LStr_RC(-)_10_5_5_1_JITTERED.fcs
    │   └── Exp180G_S1_L001_LStr_RC(-)_10_5_5_1_UNJITTERED.fcs
    └── norm_fcs_output
        └── Exp180G_S1_L001_LStr_RC(-)_10_5_5_1_normalized_filtered_Chi2Pval_1.0_jittered0.5.fcs

Files in `flash_output` folder

Exp180G_S1_L001.extendedFrags.fastq - File with merged reads in fastq format.
Exp180G_S1_L001.hist - Numeric histogram of merged read lengths.
Exp180G_S1_L001.histogram - Visual histogram of merged read lengths.
Exp180G_S1_L001.notCombined_1.fastq - Read 1 of mate pairs that were not merged.
Exp180G_S1_L001.notCombined_2.fastq - Read 2 of mate pairs that were not merged.

Files in `seqtk_output` folder

Exp180G_S1_L001.extendedFrags.fasta - File with merged reads in fasta format.

Files in `parser_output` folder

Exp180G_S1_L001_LStr_RC(-).bad - File containing reads where the parser encountered error in at least one sequence element (can be cell-barcodes or anchors or AHCA sequence).
Exp180G_S1_L001_LStr_RC(-).byFCS - A tab-delimited file with marker counts for each of the cells.
Exp180G_S1_L001_LStr_RC(-).crossoverFilter - File containing reads discarded by pcr-crossover filter.
Exp180G_S1_L001_LStr_RC(-).junk - File containing reads where the parser failed to detect all 7 barcode anchors.
Exp180G_S1_L001_LStr_RC(-).metrics - Metrics file containing reads counts filtered by each step in the parsing algorithm.
Exp180G_S1_L001_LStr.Statistics - A summary file containing statistics for cell-barcodes.

Files in `rmsinglets_ouput` Folder

Exp180G_S1_L001_LStr_RC(-)_10_5_5_1.tab_delimited - A tab-delimited file containing cells and their corresponding marker counts.
Exp180G_S1_L001_LStr_RC(-)_10_5_5_1.UNJITTERED_forFCS - A tab-delimited file containing marker counts only (cell-identifier is discarded).
Exp180G_S1_L001_LStr_RC(-)_10_5_5_1.JITTERED_forFCS - A tab-delimited file containing jittered marker counts (cell-identifier column is dicarded).
Summary.txt - A summary file generated by remove-singlets script.

Files in `fcs_output` Folder

Exp180G_S1_L001_LStr_RC(-)_10_5_5_1_UNJITTERED.fcs - Un-normalized fcs file containing marker counts for each of the cells.
Exp180G_S1_L001_LStr_RC(-)_10_5_5_1_JITTERED.fcs - Un-normalized fcs file containing jittered marker counts for each of the cells.

Files in `norm_fcs_output` Folder

Exp180G_S1_L001_LStr_RC(-)_10_5_5_1_forFCS_normalized_filtered_Chi2Pval_1.0_jittered0.5.fcs - Normalized and Jittered fcs file.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
apps		apps
example/qdata		example/qdata
LICENSE		LICENSE
README.md		README.md
custom_job.txt		custom_job.txt
environment.yml		environment.yml
qbc_conda_env.sh		qbc_conda_env.sh
submit_bash.sh		submit_bash.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Quantum Barcoding : Ultra-High Throughput Single Cell Analysis of Proteins and RNAs by Split-pool Synthesis

QBC_Single_Cell_Analysis_NGS

Description:

Platform Requirement:

Initial Setup:

Usage:

Output Folder Structure:

Files in `flash_output` folder

Files in `seqtk_output` folder

Files in `parser_output` folder

Files in `rmsinglets_ouput` Folder

Files in `fcs_output` Folder

Files in `norm_fcs_output` Folder

About

Releases 1

Packages

Contributors 2

Languages

License

bioinform/QBC_Single_Cell_Analysis_NGS

Folders and files

Latest commit

History

Repository files navigation

Quantum Barcoding : Ultra-High Throughput Single Cell Analysis of Proteins and RNAs by Split-pool Synthesis

QBC_Single_Cell_Analysis_NGS

Description:

Platform Requirement:

Initial Setup:

Usage:

Output Folder Structure:

Files in flash_output folder

Files in seqtk_output folder

Files in parser_output folder

Files in rmsinglets_ouput Folder

Files in fcs_output Folder

Files in norm_fcs_output Folder

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Files in `flash_output` folder

Files in `seqtk_output` folder

Files in `parser_output` folder

Files in `rmsinglets_ouput` Folder

Files in `fcs_output` Folder

Files in `norm_fcs_output` Folder

Packages