Table of Contents
  1. About The Project
  2. Prerequisites
  3. Installation
  4. Usage
  5. Configuration

Contributors

@sfpacman

About The Project

This repository contains a standalone Python script that extracts information from Illumina binary QC files and converts it to a YAML file. The script is a refactored replacement for the illuminate module and is built on the InterOp module. It was subsequently integrated into the whole-exome sequencing (WES) pipeline during my tenure at UCSF.

Prerequisites

Both dependencies, interop and pandas, can be installed with conda.

Installation

  1. Clone the repo
    git clone https://github.com/sfpacman/Read_InterOp_illumina/
  2. Install packages via conda
    conda install bioconda::illumina-interop
    conda install pandas

You are now ready to run the script!

Usage

Execute the Python script in the terminal:

 python run_qc_yaml_interop_production.py <target_dir> <out_dir>

Input

Provide a run directory (<target_dir>) that contains RunInfo.xml and an InterOp subdirectory holding the Illumina binary files.
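
The expected input layout can be checked before running the script. The following is a minimal sketch, not part of the repository: `check_run_dir` is a hypothetical helper, and it assumes the InterOp binary files use the usual `.bin` extension.

```python
from pathlib import Path


def check_run_dir(target_dir):
    """Return a list of layout problems in the run directory (empty if none).

    Hypothetical helper: it only checks that the files described above
    exist, not that they are valid Illumina output.
    """
    root = Path(target_dir)
    problems = []

    # RunInfo.xml must sit directly inside the run directory.
    if not (root / "RunInfo.xml").is_file():
        problems.append("missing RunInfo.xml")

    # The binary metric files live in the InterOp subdirectory.
    interop = root / "InterOp"
    if not interop.is_dir():
        problems.append("missing InterOp subdirectory")
    elif not any(interop.glob("*.bin")):
        # Assumption: InterOp metric files end in .bin
        problems.append("InterOp contains no .bin files")

    return problems
```

Calling `check_run_dir("<target_dir>")` and stopping early when the returned list is non-empty gives a clearer error than letting the parser fail partway through.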

Output

A YAML file containing the following QC metrics:

  • lane_level_metrics
  • xread_level_metrics
  • read_level_metrics
  • read_yield_metrics
  • sample_level_metrics
  • run_level_metrics

Configuration

The script takes no additional arguments for modifying Illumina QC column names or metric conversion in the final report, because the format is strictly defined. To customize either, change the implementation of the following functions:

  • get_columns_name()
  • get_metrics()

Consider implementing a YAML configuration parsing function in the future for enhanced flexibility.
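
One possible shape for that function, as a sketch only: the mapping below and the name `merge_column_names` are hypothetical and do not reflect what get_columns_name() actually returns. The sketch assumes the user's overrides have already been parsed into a dict (for example with PyYAML's `yaml.safe_load`) and merges them over the defaults.

```python
# Hypothetical defaults: placeholder names, not the script's real columns.
DEFAULT_COLUMN_NAMES = {
    "Density": "cluster_density",
    "% >= Q30": "pct_q30",
}


def merge_column_names(defaults, overrides):
    """Return a copy of `defaults` with user overrides applied.

    `overrides` is a mapping parsed from a config file. Unknown keys raise
    KeyError so that typos in the config are caught instead of being
    silently ignored.
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"unknown column(s) in config: {unknown}")

    merged = dict(defaults)   # leave the defaults untouched
    merged.update(overrides)  # user-supplied names win
    return merged
```

Rejecting unknown keys keeps the report format strict while still allowing individual names to be changed without editing the script.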