Workflow to download, process, and explore microbial RNA-seq data from NCBI SRA

SBRG/iModulonMiner

 
 

iModulonMiner

This repository provides a workflow to compute and characterize all iModulons for a selected organism. The workflow has five steps:

  1. Gather all publicly available RNA-seq data for the organism (Step 1)
  2. Process the RNA-seq data (Step 2)
  3. Inspect data to identify high-quality datasets (Step 3)
  4. Compute iModulons (Step 4)
  5. Characterize iModulons using PyModulon (Step 5)

Background

iModulons are independently modulated groups of genes that are computed through Independent Component Analysis (ICA) of a gene expression dataset. To learn more about iModulons or explore published iModulons, visit iModulonDB or see our publications for Escherichia coli, Staphylococcus aureus, or Bacillus subtilis.

Here, we introduce the concept of the Modulome for an organism, which is the set of all iModulons that can be computed for the organism based on publicly available RNA-seq data. The computational pipeline provides a step-by-step workflow to compute the Modulome for Bacillus subtilis.
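As a rough illustration of what an ICA decomposition yields, the sketch below assumes the usual factorization X ≈ M · A, where X is the expression matrix (genes by samples), M holds gene weights per component, and A holds component activities per sample. It does not run ICA itself: the gene names, sample count, matrix values, and the fixed cutoff of 0.5 are all made up for illustration, and a fixed cutoff is a simplification rather than the thresholding method this workflow uses.

```python
# Toy illustration of the decomposition X ≈ M · A (no ICA is run here).
# Every name and number below is hypothetical.

genes = ["geneA", "geneB", "geneC", "geneD"]

# M: gene weights, one row per gene, one column per independent component.
M = [
    [0.90, 0.02],
    [0.85, -0.01],
    [0.03, 0.70],
    [-0.02, 0.01],
]

# A: activities, one row per component, one column per sample.
A = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 1.0],
]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
        for row in left
    ]


def imodulon_members(weights, gene_names, threshold):
    """For each component, list the genes whose absolute weight exceeds threshold.

    A fixed threshold is an assumption made for this sketch.
    """
    n_components = len(weights[0])
    return [
        [g for g, row in zip(gene_names, weights) if abs(row[k]) > threshold]
        for k in range(n_components)
    ]


# Reconstructed expression matrix: genes x samples.
X = matmul(M, A)

# Gene sets for each component under the hypothetical cutoff.
members = imodulon_members(M, genes, threshold=0.5)
```

In this toy example the first component groups geneA and geneB, the second contains only geneC, and geneD belongs to neither because its weights are near zero in both columns.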

Setup

Docker

We have provided pre-built Docker containers with all necessary software.

To begin, install Docker and Nextflow.
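Before starting, it can help to confirm that both tools are on your PATH. The snippet below is a minimal sketch using only the Python standard library; it assumes the executables are named `docker` and `nextflow`, and the helper name `find_tools` is hypothetical.

```python
import shutil


def find_tools(names=("docker", "nextflow")):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

This only checks that the executables can be found; it does not verify versions or that the Docker daemon is running.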

Local installation

You can also run each program locally, with all requirements listed in the conda environment.yml file. For Step 5 (Characterize iModulons), additionally install pymodulon.

Cite

Please cite the following paper: iModulonMiner and PyModulon: Software for unsupervised mining of gene expression compendia
