CIA Transcriptome Assembly

Snakemake pipeline covering all the steps taken to reconstruct the CIA transcriptome assembly, as described in (Alfonso-Gonzalez, 2022). The pipeline is built around the Drosophila genome, so input data should come from Drosophila.

Requirements

All package dependencies are installed with conda, with the exception of SQANTI and FLAIR, which are installed during the pipeline run from specific GitHub commits.

Before running, make sure you have conda installed, then create the pipeline environment:

    conda create -n snakemake-cia -c conda-forge -c bioconda -c defaults snakemake

Then run snakemake with --use-conda to automatically create the environments during the pipeline run.

Briefly, dependencies are listed here.

Package dependencies:

R dependencies:

R Bioconductor:

Before you run

Edit config/config.yaml to set the parameters you would like to use to run the pipeline, and config/units.tsv to specify each sample's path and sample_type (one of flam-seq, iso-seq, ont-cdna, or ont-direct). Data files can be FASTA or FASTQ, either gzipped or uncompressed.
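
A quick sanity check of config/units.tsv before launching a long run can catch typos in sample_type early. The following is a minimal sketch, not part of the pipeline: the column name "path", the list of accepted file extensions, and the helper name check_units are assumptions made for illustration, so adjust them to match the header of your own units.tsv.

```python
import csv
import io

# Sample types named in this README.
VALID_SAMPLE_TYPES = {"flam-seq", "iso-seq", "ont-cdna", "ont-direct"}
# Assumed FASTA/FASTQ extensions; the pipeline itself may accept others.
SEQ_SUFFIXES = (".fasta", ".fa", ".fastq", ".fq")


def check_units(handle, path_col="path", type_col="sample_type"):
    """Return a list of problems found in a tab-separated units table.

    `path_col` is a hypothetical column name; change it to match the
    header of your config/units.tsv.
    """
    problems = []
    reader = csv.DictReader(handle, delimiter="\t")
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        sample_type = (row.get(type_col) or "").strip()
        if sample_type not in VALID_SAMPLE_TYPES:
            problems.append(f"line {line_no}: unknown sample_type {sample_type!r}")
        raw_path = (row.get(path_col) or "").strip()
        path = raw_path.lower()
        if path.endswith(".gz"):
            path = path[:-3]  # gzipped input is allowed, so strip the suffix
        if not path.endswith(SEQ_SUFFIXES):
            problems.append(f"line {line_no}: {raw_path!r} does not look like FASTA/FASTQ")
    return problems


if __name__ == "__main__":
    # Made-up rows for illustration only.
    example = (
        "sample\tpath\tsample_type\n"
        "heads\tdata/heads.fastq.gz\tiso-seq\n"
        "bad\tdata/bad.bam\trna-seq\n"
    )
    for problem in check_units(io.StringIO(example)):
        print(problem)
```

To check a real file, pass an open handle instead of the in-memory example, for instance check_units(open("config/units.tsv", newline="")). The check looks only at the file name and sample_type value; it does not open or validate the sequence files themselves.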

Run

Modify the snakemake command in run.sh to use parameters that are appropriate for your computing or cluster environment. The pipeline uses conda, so be sure to include --use-conda in the snakemake command.

Execute ./run.sh.

Testing

You can download a test dataset, for which the config/units.tsv is already configured, from Zenodo here. Run tar -xzvf test.tar.gz in this directory and run the pipeline using ./run.sh.

Troubleshooting

Occasionally there can be an issue installing R packages in the cia-sqanti environment. This will manifest in an error like this:

Error in if (nzchar(SHLIB_LIBADD)) SHLIB_LIBADD else character() : 
argument is of length zero

If you run into this error, follow the instructions from this thread.
