Skip to content

cag1343/CIAtranscriptome_assembly

Repository files navigation

CIA Transcriptome Assembly

Snakefile pipeline of all the steps taken to reconstruct the CIA transcriptome assembly as in (Alfonso-Gonzalez, 2022). This pipeline is based on Drosophila genome. Input should be Drosophila based data.

Requirements

All package dependencies are downloaded using conda, with the exception of SQANTI and FLAIR which are installed during the pipeline run using specific github commits.

Before running, make sure you have conda installed, then run conda create -n snakemake-cia -c conda-forge -c bioconda -c defaults snakemake to create the pipeline environment. Then run snakemake with --use-conda to automatically create the environments during the pipeline run.

Briefly, dependencies are listed here.

Packages dependencies:

R dependencies:

R Bioconductor:

Before you run

Edit config/config.yaml to reflect the parameters you would like to use to run the pipeline, as well as config/units.tsv to specify the sample path and sample_type -- one of flam-seq, iso-seq, ont-cdna, or ont-direct. Data files can be gzipped or raw FASTA or FASTQ files.

Run

Modify the snakemake command in run.sh to use parameters that are appropriate for your computing or cluster environment. The pipeline uses conda, so be sure to include --use-conda in the snakemake command.

Execute ./run.sh.

Testing

You can download a test dataset, for which the config/units.tsv is already configured, from Zenodo here. Run tar -xzvf test.tar.gz in this directory and run the pipeline using ./run.sh.

Troubleshooting

Occasionally there can be an issue installing R packages in the cia-sqanti environment. This will manifest in an error like this:

Error in if (nzchar(SHLIB_LIBADD)) SHLIB_LIBADD else character() : 
argument is of length zero

if you run into this error, follow the instructions from this thread.

About

Workflow CIA transcriptome assembly

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published