Used download_koterniak_2020.sh to download the SRR runs listed in the metadata table. The Kaletsky (2018) data were also downloaded. The Koterniak data were downloaded with the older fastq-dump and the Kaletsky data with fasterq-dump, as some of the bigger files failed to download. However, most of the Kaletsky (2018) files are single-end reads, so they are not processed here (maybe in a later version).
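To tell paired-end runs from single-end ones after download, a check along these lines may help. It is only a sketch: it assumes uncompressed FASTQ files named `<run>_1.fastq` and `<run>_2.fastq` for paired-end runs (the usual fasterq-dump split output), and the function name `is_paired` and the directory argument are hypothetical, not part of this repository.

```shell
# Hypothetical helper, not part of this repo.
# Succeeds if both mate files exist for a run in the given directory.
# Assumes uncompressed files named <run>_1.fastq and <run>_2.fastq.
is_paired() {
  run=$1
  dir=$2
  [ -f "$dir/${run}_1.fastq" ] && [ -f "$dir/${run}_2.fastq" ]
}

# Example (paths are illustrative):
#   is_paired SRR6238092 data/fastq && echo "paired-end"
```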

Used star_index.sh to create the STAR index (saved along with the references), and dsq_star_align.sh to align. The option to sort BAMs during alignment was not used, as it kept running out of memory. The dSQ jobs are defined in joblist_align.txt and re_joblist_align.txt (bigger samples that need more time). The job arrays were then created with:

dsq --job-file src/joblist_align.txt --mem 20GB --cpus-per-task 10 -t 5:00:00 --mail-type ALL
dsq --job-file src/re_joblist_align.txt --mem 20GB --cpus-per-task 10 -t 23:50:00 --mail-type ALL
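A dSQ job file is plain text with one command per line, each line becoming one job in the array. A file for the ten TRAP runs could be generated roughly as below. This is a sketch under an assumption: it supposes dsq_star_align.sh takes the run accession as its only argument, which the actual script may not do, and the helper name `make_joblist` is hypothetical.

```shell
# Hypothetical helper: print one job line per run accession in a numeric range.
# Assumes dsq_star_align.sh accepts the accession as its single argument
# (an assumption; check the real script's interface).
make_joblist() {
  n=$1
  while [ "$n" -le "$2" ]; do
    printf 'bash src/dsq_star_align.sh SRR%s\n' "$n"
    n=$((n + 1))
  done
}

# Print the job lines for SRR6238092 through SRR6238101.
make_joblist 6238092 6238101
```

Redirecting the output to a file gives something that can be passed to `dsq --job-file`.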

After this, the BAMs were sorted and indexed with the dSQ script sort_bams.sh and joblist_sortindex.txt, using:

dsq --job-file src/joblist_sortindex.txt --mem 20GB --cpus-per-task 5 -t 23:50:00 --mail-type ALL

The SJ files created by STAR and the sorted, indexed BAMs were transferred and renamed manually based on the description in GEO:

Accession	Description	Run	Short Name
GSM2836730	muscle_TRAP_rep_1	SRR6238092	muscle_6238092
GSM2836731	muscle_TRAP_rep_2	SRR6238093	muscle_6238093
GSM2836732	intestine_TRAP_rep_1	SRR6238094	intestine_6238094
GSM2836733	intestine_TRAP_rep_2	SRR6238095	intestine_6238095
GSM2836734	neuronal_TRAP_rep_1	SRR6238096	neurons_6238096
GSM2836735	neuronal_TRAP_rep_2	SRR6238097	neurons_6238097
GSM2836736	serotonin_TRAP_rep_1	SRR6238098	serotonergic_6238098
GSM2836737	serotonin_TRAP_rep_2	SRR6238099	serotonergic_6238099
GSM2836738	dopamine_TRAP_rep_1	SRR6238100	dopaminergic_6238100
GSM2836739	dopamine_TRAP_rep_2	SRR6238101	dopaminergic_6238101
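The renaming was done by hand, but the mapping in the table above is regular enough to script. The sketch below derives the short name from the GEO description and the run accession. The function name `short_name` is hypothetical, and the tissue labels are taken from the table.

```shell
# Hypothetical helper: build the short name from a GEO description and a run.
# The three neuronal descriptions get the labels used in the table; any other
# description keeps its first underscore-separated word as the tissue.
short_name() {
  desc=$1
  run=$2
  case $desc in
    neuronal_*)  tissue=neurons ;;
    serotonin_*) tissue=serotonergic ;;
    dopamine_*)  tissue=dopaminergic ;;
    *)           tissue=${desc%%_*} ;;
  esac
  # Drop the "SRR" prefix from the run accession.
  printf '%s_%s\n' "$tissue" "${run#SRR}"
}

short_name intestine_TRAP_rep_1 SRR6238094
```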

Note: samples SRR6238102-6238111 are not processed further here, as they are the input (whole worm) for each of these samples.

StringTie quantification with src/stringtie.sh, then export of the TPMs with summarize_stringtie_q.R (run manually on the cluster). That gives us the intermediates/240827_strq_outs/240828_tx_TPM.tsv file.

Manually deleted the first header field (transcript_id\t) so that the header starts with the sample names.
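The header edit was made by hand, presumably because SUPPA expects an expression table whose header lists only the sample names. A scripted equivalent could look like the sketch below. The function name `strip_header_id` is hypothetical, and it assumes the file is tab-separated with `transcript_id` as the first header field.

```shell
# Hypothetical helper: remove the leading "transcript_id<TAB>" from the
# header line only and print the result; data rows are left unchanged.
strip_header_id() {
  awk 'NR == 1 { sub(/^transcript_id\t/, "") } { print }' "$1"
}

# Example (output path is illustrative):
#   strip_header_id intermediates/240827_strq_outs/240828_tx_TPM.tsv > tx_TPM_fixed.tsv
```

Writing to a new file rather than editing in place keeps the original export available for comparison.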

Finally, run src/suppa_psi.sh to get the PSI per event and src/suppa_dpsi.sh for the deltaPSI. The results are analyzed in the repo suppa_events along with the neuronal quantifications.


cengenproject/reanalysis_tissue
