Create files to display the splicing data in the genome browser.
There are three inputs: the GFF annotation, a list of samples to exclude, and the BAMs. The outputs are bigWig (for exonic reads) and BED (for splice junctions) files that can be loaded into a JBrowse2 browser.
The input sequencing data must first be processed with the bulk_align pipeline, to yield the aligned BAMs.
The "master script" is src/generate_browser_data.sh, which:
- in R/sj_to_bed, takes the SJ.out.tab files generated by STAR, generates bigBed files with the counts
- if needed, merges the BAMs (not needed in bsn12) and removes multimappers (bsn9 uses stringtie_quantif/src/prep_alignments.sh, needs update)
- in R/bam_to_bigwig, takes the combined BAMs, generates bigWig files with exonic counts
- exports everything in a xxx_browser.tar.gz archive, to be transferred to the genome browser (see splicing_website)
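To illustrate the junction step, here is a minimal sketch of how one STAR SJ.out.tab file maps onto BED6 lines. It is not the code in R/sj_to_bed: the function name `sj_to_bed`, the `SJ<n>` feature names, the choice of the uniquely mapping read count as the score, and the example coordinates are all assumptions made for illustration. The column meanings (1-based intron start and end, strand coded as 0/1/2, unique read count in column 7) follow the STAR manual.

```shell
# Sketch only: convert a STAR SJ.out.tab file to BED6 (hypothetical helper,
# not the repository's R implementation).
# STAR columns used: 1 chromosome, 2 intron start (1-based),
# 3 intron end (1-based, inclusive), 4 strand (0 undefined, 1 "+", 2 "-"),
# 7 number of uniquely mapping reads crossing the junction.
sj_to_bed() {
  awk 'BEGIN { OFS = "\t" }
       {
         strand = "."
         if ($4 == 1) strand = "+"
         else if ($4 == 2) strand = "-"
         # BED is 0-based and half-open: shift the start, keep the end
         print $1, $2 - 1, $3, "SJ" NR, $7, strand
       }' "$1"
}

# Tiny made-up input (hypothetical coordinates and counts)
demo_dir=$(mktemp -d)
printf 'I\t100\t200\t1\t1\t1\t12\t3\t30\nII\t500\t650\t2\t1\t0\t4\t0\t25\n' \
  > "$demo_dir/example_SJ.out.tab"

sj_to_bed "$demo_dir/example_SJ.out.tab" > "$demo_dir/example_junctions.bed"
cat "$demo_dir/example_junctions.bed"
```

Note that the BED specification expects scores between 0 and 1000, so raw read counts may need to be capped or moved to an extra column before conversion to bigBed.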
Beforehand, run the bulk_align pipeline.
In older versions (with bsn9), we used the "augmented" annotation created in stringtie_quantif; now (with bsn12) we directly use the WormBase annotation.
If needed, update the data/outliers_to_ignore.txt list based on QC.
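As a rough illustration of how such an exclusion list can be applied, the sketch below drops listed samples from a list of sample IDs. It assumes one sample ID per line in both files, which may not match the actual format of data/outliers_to_ignore.txt; the function name `filter_samples` and the sample names are hypothetical, and the master script may do this differently.

```shell
# Sketch only: remove excluded samples from a list of sample IDs.
# Assumes one ID per line in both files (an assumption, not the repo's format).
filter_samples() {
  # $1: file with all sample IDs, $2: file with IDs to exclude
  # -v invert match, -x match whole lines, -F fixed strings, -f patterns from file
  grep -v -x -F -f "$2" "$1"
}

# Made-up sample names for the demonstration
filter_dir=$(mktemp -d)
printf 'sampleA\nsampleB\nsampleC\n' > "$filter_dir/all_samples.txt"
printf 'sampleB\n' > "$filter_dir/outliers.txt"

kept=$(filter_samples "$filter_dir/all_samples.txt" "$filter_dir/outliers.txt")
printf '%s\n' "$kept"
```

Whole-line matching (`-x`) matters here: without it, excluding `sampleB` would also drop a sample named `sampleB2`.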
Edit the parameters at the beginning of src/generate_browser_data.sh, and run it. If everything goes well, the Slurm log ends with e.g.:
Send to vps with:
scp /home/aw853/ycga_project/splicing_browser/data/outs/231121_browser.tar.gz cengen-vps:/var/www/public_data/splicing
Use this command to upload the archive, then see the splicing_website repo for next steps.
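Before uploading, it can be worth checking that the archive is readable and contains the expected files. The sketch below builds a throwaway archive and lists it with `tar -tzf`; the directory layout and file names (`demo_browser`, `sampleA.bw`, `sampleA.bb`) are invented for the example and do not reflect the real archive contents.

```shell
# Sketch only: list an archive's contents without extracting it.
# The archive built here is a stand-in with hypothetical file names.
work=$(mktemp -d)
mkdir -p "$work/demo_browser"
: > "$work/demo_browser/sampleA.bw"   # empty placeholder for a bigWig
: > "$work/demo_browser/sampleA.bb"   # empty placeholder for a bigBed
tar -czf "$work/demo_browser.tar.gz" -C "$work" demo_browser

# tar exits non-zero if the archive is truncated or corrupt
listing=$(tar -tzf "$work/demo_browser.tar.gz")
n_bigwig=$(printf '%s\n' "$listing" | grep -c '\.bw$')
echo "bigWig files in archive: $n_bigwig"
```

For the real archive, the same `tar -tzf` call can be pointed at the path printed in the Slurm log.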