Below is an outline of the workflow for processing the read data generated by a CRISPR functional screen.
You start with the following data files:
- a reference csv (e.g., reference.csv) containing the sgRNA sequences and identifiers
- multiple fastq (or, if compressed, fastq.gz) files. A single-sgRNA CRISPR functional screen has one fastq file per sample. A dual-sgRNA CRISPR functional screen has two fastq files (forward reads and reverse reads) per sample.
(1) Use CRISPR.sgRNA_create_ref_fasta to generate a multi-record fasta file containing the sgRNA sequences.
(2) Create an alignment index from this reference fasta file. The type of index depends on your choice of aligner (e.g., Bowtie1 or Bowtie2).
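
To make the input and output of step (1) concrete, here is a minimal Python sketch of turning a reference csv into a multi-record fasta. It is not the implementation of CRISPR.sgRNA_create_ref_fasta. The column names "id" and "sequence" and the function name csv_to_fasta are assumptions made for illustration, so adjust them to match your own reference.csv.

```python
import csv
import io


def csv_to_fasta(csv_text, id_col="id", seq_col="sequence"):
    """Convert reference csv text into multi-record fasta text.

    id_col and seq_col are hypothetical column names; change them to
    match the header of your own reference csv.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row[id_col].strip()
        # Upper-case the sequence so every record uses the same alphabet.
        seq = row[seq_col].strip().upper()
        records.append(f">{name}\n{seq}\n")
    return "".join(records)


# Tiny made-up reference with two sgRNAs (illustrative sequences only).
example_csv = (
    "id,sequence\n"
    "sgRNA_1,ACGTACGTACGTACGTACGT\n"
    "sgRNA_2,ttgcaagctagctagctagc\n"
)
fasta_text = csv_to_fasta(example_csv)
print(fasta_text, end="")
```

A fasta file written this way can then be given to the index builder that matches your aligner (bowtie-build for Bowtie1, bowtie2-build for Bowtie2).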
For each fastq (or fastq.gz) file...
(3) Use CRISPR.sgRNA_read_trimmer to trim the fastq files. The trimmed files will only contain the sgRNA sequence reads.
(4) Align the trimmed fastq files to the reference fasta file (using the index created earlier). When aligning the forward reads, be sure to align only to the forward strand (e.g., for both Bowtie1 and Bowtie2, use the --norc command line option). When aligning the reverse reads, be sure to align only to the reverse-complement ("Crick") strand (e.g., for both Bowtie1 and Bowtie2, use the --nofw command line option).
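
The strand rule in step (4) can be sketched as a small Python helper that assembles a Bowtie2 command line. This is only an illustration for users running Bowtie2 directly. The function name bowtie2_command, the "forward"/"reverse" labels, and the file names are assumptions. The options -x, -U, -S, --norc and --nofw are standard Bowtie2 options.

```python
def bowtie2_command(index_prefix, fastq_path, sam_path, read_direction):
    """Build a Bowtie2 argument list for unpaired, trimmed sgRNA reads.

    read_direction is a hypothetical label: "forward" or "reverse".
    """
    if read_direction == "forward":
        # Forward reads: do not align against the reverse-complement strand.
        strand_flag = "--norc"
    elif read_direction == "reverse":
        # Reverse reads: do not align against the forward strand.
        strand_flag = "--nofw"
    else:
        raise ValueError("read_direction must be 'forward' or 'reverse'")
    return [
        "bowtie2", strand_flag,
        "-x", index_prefix,   # index prefix created in step (2)
        "-U", fastq_path,     # unpaired (trimmed) reads
        "-S", sam_path,       # SAM output file
    ]


# Hypothetical file names for one dual-sgRNA sample.
fwd_cmd = bowtie2_command("sgRNA_index", "sample1_R1.trimmed.fastq",
                          "sample1_R1.sam", "forward")
rev_cmd = bowtie2_command("sgRNA_index", "sample1_R2.trimmed.fastq",
                          "sample1_R2.sam", "reverse")
print(" ".join(fwd_cmd))
print(" ".join(rev_cmd))
```

If Bowtie2 is installed locally, each list could be passed to subprocess.run(cmd, check=True). In a single-sgRNA screen only the forward form applies.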
(5) Once you have created all of the SAM files, use CRISPR.single_sgRNA_count or CRISPR.dual_sgRNA_count to summarize sgRNA counts. In the case of a single-sgRNA CRISPR screen, use CRISPR.single_sgRNA_count to summarize the alignments of each SAM file, creating one csv file per sample. In the case of a dual-sgRNA CRISPR screen, use CRISPR.dual_sgRNA_count to summarize the alignments of paired files. You will provide CRISPR.dual_sgRNA_count with two SAM files per sample: one containing alignments for the forward reads, the other containing alignments for the reverse reads. You must sort both SAM files by queryname (use Picard.SortSam) before using them in CRISPR.dual_sgRNA_count.
(6) Once you have created all of the per-sample csv files, use CRISPR.combine_csv_files to combine them into a single csv-formatted dataset.
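
To show what steps (5) and (6) produce in the single-sgRNA case, here is a minimal Python sketch that counts aligned reads per sgRNA from SAM records and merges the per-sample counts into one csv table. It is not the implementation of CRISPR.single_sgRNA_count or CRISPR.combine_csv_files. The function names, sample names, column layout, and example records are assumptions. The dual-sgRNA case, which pairs forward and reverse records by read name, is not shown.

```python
import csv
import io
from collections import Counter


def count_sam_alignments(sam_lines):
    """Count aligned reads per reference name (sgRNA id) in SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        flag, ref_name = int(fields[1]), fields[2]
        # FLAG bit 0x4 marks an unmapped read; RNAME "*" means no reference.
        if flag & 0x4 or ref_name == "*":
            continue
        counts[ref_name] += 1
    return counts


def combine_counts(per_sample):
    """Merge {sample: Counter} into csv text, one row per sgRNA."""
    sample_names = sorted(per_sample)
    sgrna_ids = sorted({sg for c in per_sample.values() for sg in c})
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sgRNA"] + sample_names)
    for sg in sgrna_ids:
        # sgRNAs absent from a sample get a count of 0.
        writer.writerow([sg] + [per_sample[n].get(sg, 0) for n in sample_names])
    return out.getvalue()


# Made-up SAM records for two samples (illustrative only).
sam_a = [
    "@SQ\tSN:sgRNA_1\tLN:20",
    "r1\t0\tsgRNA_1\t1\t42\t20M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tsgRNA_2\t1\t42\t20M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tsgRNA_1\t1\t42\t20M\t*\t0\t0\tACGT\tIIII",
]
sam_b = ["r1\t0\tsgRNA_2\t1\t42\t20M\t*\t0\t0\tACGT\tIIII"]

combined = combine_counts({
    "sample_A": count_sam_alignments(sam_a),
    "sample_B": count_sam_alignments(sam_b),
})
print(combined, end="")
```

In practice the SAM lines would be read from the files produced in step (4), for example by iterating over an open file handle.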