
Collection of GenePattern modules that may be used for processing sequencing data from CRISPR functional screens.


cbirger/GP-CRISPR


GP-CRISPR

Below is an outline of the workflow for processing the read data generated by a CRISPR functional screen.

You start with the following data files:

  • a reference csv (e.g., reference.csv) containing the sgRNA sequences and identifiers
  • multiple fastq (or, if compressed, fastq.gz) files. A single-sgRNA CRISPR functional screen has one fastq file per sample. A dual-sgRNA CRISPR functional screen has two fastq files (forward reads and reverse reads) per sample.

(1) Use CRISPR.sgRNA_create_ref_fasta to generate a multi-record fasta file containing the sgRNA sequences.
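To make step (1) concrete, here is a minimal Python sketch of turning a reference csv into a multi-record fasta. It is not the CRISPR.sgRNA_create_ref_fasta implementation: the column names `id` and `sequence` and the function name `reference_csv_to_fasta` are assumptions made for illustration, since the layout of reference.csv is not specified here.

```python
import csv
import io


def reference_csv_to_fasta(csv_text, id_column="id", seq_column="sequence"):
    """Convert reference CSV text into multi-record FASTA text.

    Assumption: the CSV has a header row with one column holding the
    sgRNA identifier and one holding the sgRNA sequence.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        identifier = row[id_column].strip()
        sequence = row[seq_column].strip().upper()
        # One FASTA record per sgRNA: a ">" header line, then the sequence.
        records.append(f">{identifier}\n{sequence}\n")
    return "".join(records)


# Hypothetical two-row reference, for illustration only.
example_csv = "id,sequence\nsgA,ACGTACGTACGTACGTACGT\nsgB,ttgcaattgcaattgcaatt\n"
fasta_text = reference_csv_to_fasta(example_csv)
print(fasta_text, end="")
```

Each sgRNA becomes its own fasta record, so the aligner later reports the sgRNA identifier as the reference name of every alignment.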

(2) Create an alignment index from this reference fasta file. The type of index depends on your choice of aligners (e.g., Bowtie1 or Bowtie2).
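As a rough illustration of step (2), the sketch below assembles the index-building command for either aligner. `bowtie-build` and `bowtie2-build` are the index builders shipped with Bowtie1 and Bowtie2; the helper name `index_build_command` and the file names are assumptions, and the command is only printed here, not executed.

```python
import shlex


def index_build_command(reference_fasta, index_prefix, aligner="bowtie2"):
    """Return the argument list that builds an alignment index.

    Both builders take the reference FASTA followed by the index prefix.
    """
    builders = {"bowtie1": "bowtie-build", "bowtie2": "bowtie2-build"}
    if aligner not in builders:
        raise ValueError(f"unsupported aligner: {aligner}")
    return [builders[aligner], reference_fasta, index_prefix]


# Hypothetical file names, for illustration only.
command = index_build_command("reference.fasta", "sgrna_index")
print(shlex.join(command))
# To actually build the index, pass the list to subprocess.run(command, check=True).
```

The index must match the aligner used in step (4): a Bowtie1 index cannot be used with Bowtie2, and vice versa.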

For each fastq (or fastq.gz) file...

(3) Use CRISPR.sgRNA_read_trimmer to trim the fastq files. The trimmed files will only contain the sgRNA sequence reads.
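The trimming in step (3) can be pictured with the sketch below. How CRISPR.sgRNA_read_trimmer locates the sgRNA within a read is not described here, so this example simply assumes the sgRNA sits at a fixed offset with a fixed length; `trim_fastq`, `start`, and `length` are hypothetical names.

```python
def trim_fastq(fastq_text, start, length):
    """Keep only bases [start, start + length) of every read.

    Assumption: the sgRNA occupies the same fixed window in every read.
    The quality string is trimmed to the same window as the sequence.
    """
    lines = fastq_text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ records must have exactly four lines each")
    trimmed = []
    for i in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[i:i + 4]
        trimmed.extend([
            header,
            sequence[start:start + length],
            separator,
            quality[start:start + length],
        ])
    return "\n".join(trimmed) + "\n"


# Hypothetical read: 4 leading bases, an 8-base "sgRNA", 2 trailing bases.
example_fastq = "@r1\nTTTTACGTACGTGG\n+\nIIIIJJJJJJJJKK\n"
trimmed_fastq = trim_fastq(example_fastq, start=4, length=8)
print(trimmed_fastq, end="")
```

Trimming the sequence and quality lines together keeps each record valid, since aligners expect both to have the same length.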

(4) Align the trimmed fastq files to the reference fasta file (using the index created earlier). When aligning the forward reads, align only to the forward strand (for both Bowtie1 and Bowtie2, use the --norc command line option). When aligning the reverse reads, align only to the reverse-complement ("Crick") strand (for both Bowtie1 and Bowtie2, use the --nofw command line option).
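The strand rule in step (4) can be sketched as a small helper that picks the right flag for each read direction. This example assumes Bowtie2 and unpaired input (`-x` for the index, `-U` for the reads, `-S` for the SAM output); the helper name and file names are hypothetical, and the command is printed rather than run.

```python
import shlex


def bowtie2_align_command(index_prefix, fastq_path, sam_path, read_direction):
    """Return a Bowtie2 argument list restricted to one reference strand.

    Forward reads use --norc (no reverse-complement alignments);
    reverse reads use --nofw (no forward-strand alignments).
    """
    strand_flags = {"forward": "--norc", "reverse": "--nofw"}
    if read_direction not in strand_flags:
        raise ValueError("read_direction must be 'forward' or 'reverse'")
    return [
        "bowtie2",
        strand_flags[read_direction],
        "-x", index_prefix,
        "-U", fastq_path,
        "-S", sam_path,
    ]


# Hypothetical file names, for illustration only.
forward_cmd = bowtie2_align_command(
    "sgrna_index", "sample1_R1.trimmed.fastq", "sample1_R1.sam", "forward")
reverse_cmd = bowtie2_align_command(
    "sgrna_index", "sample1_R2.trimmed.fastq", "sample1_R2.sam", "reverse")
print(shlex.join(forward_cmd))
print(shlex.join(reverse_cmd))
```

Restricting each file to one strand prevents a short sgRNA read from being assigned to a reference record in the wrong orientation.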

(5) Once you have created all of the SAM files, use CRISPR.single_sgRNA_count or CRISPR.dual_sgRNA_count to summarize sgRNA counts. In the case of a single-sgRNA CRISPR screen, use CRISPR.single_sgRNA_count to summarize the alignments of each SAM file, creating a csv file per sample. In the case of a dual-sgRNA CRISPR screen, use CRISPR.dual_sgRNA_count to summarize the alignments of paired files. You will provide CRISPR.dual_sgRNA_count with two SAM files, one containing alignments for the forward reads, the other containing alignments for the reverse reads. You must sort these SAM files by queryname (use Picard.SortSam) before using them in CRISPR.dual_sgRNA_count.
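The counting in step (5) might look roughly like the sketch below, which also shows why the two SAM files need the same queryname order in the dual case. It is not the code of CRISPR.single_sgRNA_count or CRISPR.dual_sgRNA_count. It assumes both SAM files hold one record per read, with identical read names in the same order, and it counts a pair only when both reads aligned; all function names are hypothetical.

```python
from collections import Counter

UNMAPPED = 0x4  # SAM flag bit set when a read did not align


def parse_sam(sam_text):
    """Yield (read name, flag, reference name) for each alignment line."""
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split("\t")
        yield fields[0], int(fields[1]), fields[2]


def count_single(sam_text):
    """Count aligned reads per sgRNA (the SAM reference name)."""
    counts = Counter()
    for _, flag, sgrna in parse_sam(sam_text):
        if not flag & UNMAPPED:
            counts[sgrna] += 1
    return counts


def count_dual(forward_sam, reverse_sam):
    """Count (forward sgRNA, reverse sgRNA) pairs, walking both files together."""
    counts = Counter()
    # zip stops at the shorter input; equal record counts are assumed.
    for fwd, rev in zip(parse_sam(forward_sam), parse_sam(reverse_sam)):
        if fwd[0] != rev[0]:
            raise ValueError(f"read names out of step: {fwd[0]} vs {rev[0]}")
        if not (fwd[1] & UNMAPPED or rev[1] & UNMAPPED):
            counts[(fwd[2], rev[2])] += 1
    return counts


def sam_line(name, flag, ref):
    """Build a minimal 11-field SAM record for the example."""
    return "\t".join([name, str(flag), ref, "1", "42", "4M", "*", "0", "0", "ACGT", "IIII"])


# Hypothetical alignments: r3 is unmapped in the forward file.
forward = "\n".join([sam_line("r1", 0, "sgA"), sam_line("r2", 0, "sgB"), sam_line("r3", 4, "*")])
reverse = "\n".join([sam_line("r1", 16, "sgC"), sam_line("r2", 16, "sgC"), sam_line("r3", 16, "sgD")])
single_counts = count_single(forward)
dual_counts = count_dual(forward, reverse)
print(dict(single_counts))
print(dict(dual_counts))
```

Because the two files are read in lockstep, a forward and a reverse record are matched only if they sit at the same position in both files, which is what sorting by queryname guarantees.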

(6) Once you have created all of the per-sample csv files, use CRISPR.combine_csv_files to combine them into a single csv-formatted dataset.
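As an illustration of step (6), the sketch below merges per-sample count tables into one dataset with a column per sample. The per-sample layout (`sgRNA,count` header) and the function name `combine_counts` are assumptions, not the documented format of CRISPR.combine_csv_files; an sgRNA missing from a sample is given a count of 0.

```python
import csv
import io


def combine_counts(per_sample_csv):
    """Merge per-sample CSV texts into one CSV with a column per sample.

    per_sample_csv maps a sample name to CSV text with the assumed
    header "sgRNA,count".
    """
    samples = list(per_sample_csv)
    table = {}
    for sample, text in per_sample_csv.items():
        for row in csv.DictReader(io.StringIO(text)):
            table.setdefault(row["sgRNA"], {})[sample] = int(row["count"])

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sgRNA"] + samples)
    for sgrna in sorted(table):
        # Fill in 0 where a sample has no reads for this sgRNA.
        writer.writerow([sgrna] + [table[sgrna].get(s, 0) for s in samples])
    return out.getvalue()


# Hypothetical per-sample tables, for illustration only.
combined = combine_counts({
    "sample1": "sgRNA,count\nsgA,5\nsgB,2\n",
    "sample2": "sgRNA,count\nsgB,7\n",
})
print(combined, end="")
```

Keeping one row per sgRNA and one column per sample gives a matrix that downstream screen-analysis tools can typically read directly.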
