CHANGES.txt

# VERSION 1.0

- extra pipelines: list wiht bpipe-config -vp
- trimmomatic for single reads (current dir and project)

# VERSION 0.8

- index html for pipeline
- force skip samplesheets check (-s)
- new pipelines: XHMM, VARIANTS from ALLELES
- send test
- branch variables:
   Once set, a branch variable becomes referenceable directly by all subsequent stages that are part of the branch, including any child branches. The value can be referenced as a property on the branch object, for example 'branch.planet', but can also be referenced without the branch. prefix.
- removed /dev/shm  (optional)
- check on global vars (fail if is not defined) with requires


# Version 0.6

- CENTRALIZED DEFAULT PATHS 
   the module: default_paths_gfu.groovy contains the SOFTWARE PATHS used currently by bpipe in ALL our modules
   (location: /lustre1/tools/libexec/bpipeconfig/modules/default_paths_gfu.groovy). You can override the SOFTWARE PATHS definition in your pipeline.groovy file

- pipelines reorganized by type of input (exome, genome and rnaseq)

- Quality Control pipelines are now in subgroup “quality_control"

- exome pipelines: 
  * EXOME PIPELINES use TARGET_INTERVALS (nextera rapid capture targets as default)
  * EXOME VARIANTS CALLING and BAM RECALIBRATION use by default HealtyExomes in /lustre1/workspace/Stupka/HealtyExomes/
  * EXOME VARIANTS ANNOTATION: remove by default HealtyExomes from the vcf file  
  * EXOME ALIGN PROJECT now generate an HsMetrics report in the doc subdirectory (see quality_control)

- genome pipelines:
  * DON’T USE TARGET INTERVALS
  * VARIANTS CALLING and BAM RECALIBRATION DON’T USE by default HealtyExomes in /lustre1/workspace/Stupka/HealtyExomes/ (but can be changed in the pipeline file)

- variants annotation (exome and genome):
  * upgraded dbSNFSP annotation (current: dbNSFP2.4.txt.gz)

- quality_control:
  * new pipeline hsmetrics, calculate hs metrics of a set of bam files (for columns definition see: http://picard.sourceforge.net/picard-metric-definitions.shtml#HsMetrics)
  * fastqc_project pipeline: 
    Works like the align_project pipelines, You have a raw_data directory with fastq.gz reads (after demultiplex) and a scratch directory to perform analysis:
       /lustre2/raw_data/RUN_NAME/PINAME_PROJECTID_PROJECTNAME
       /lustre2/scratch/PINAME/PROJECTID_PROJECTNAME

    You launch the pipelines FROM the /lustre2/scratch/ directory pointing to /lustre2/raw_data/ directory:
  cd /lustre2/scratch/PINAME/PROJECTID_PROJECTNAME
  bpipe-config pipe fastqc_project /lustre2/raw_data/RUN_NAME/PINAME_PROJECTID_PROJECTNAME
   
  * THIS CONFIGURATION (working dir: scratch, data dir: raw_data) avoid pollution of the raw_data dir with bpipe intermediate files (everything is done the scratch dir)

bpipe 0.9.8.6_beta_4.gfu:
- send, check and fail

bpipe-config new commands:
- project

With the project command you can launch the same project pipeline on multiple projects.

Example:

SCENARIO: You have just demultiplexed the RUN140 in  /lustre2/raw_data/140211_SN859_0140_AD2A9EACXX and you want to launch the fastqc_project pipeline on all the Projects in the RUN 140

COMMANDS:

> mkdir /lustre2/scratch/RUN140_fastqc
> cd /lustre2/scratch/RUN140_fastqc
> bpipe-config project fastqc_project /lustre2/raw_data/140211_SN859_0140_AD2A9EACXX/Project_*
> ./runner.sh

A fastqc project pipeline will be launched for each Project dir in the RUN140


- smerge 

This command with this cool name merge SampleSheet.csv from different projects


# Version 0.5 

More pipelines: 

- soapsplice_submit_project IOS 009
- exome_variants_calling  IOS 005
- genome_variants_calling IOS 015
- variants_annotation     IOS 005

* prototype for medip pipeline

The new fastqc project pipeline generate a better fastqc, with an index page to navigate samples:

Example: /lustre1/QC-Illumina/140211_SN859_0140_AD2A9EACXX/Stupka_10_RareDisease_fastqc_report/index.html


# Version 0.4

* PROJECT PIPELINES: 
    More or less is like the One Ring of Sauron: one bpipe to run them all.
    in bpipe-config -p (pipelines list) you will notice a new category: illumina_projects. 
    There you will find PROJECT pipepelines that parallelize on multiple samples. 
    If classic pipelines create a bpipe instance for each sample, the project pipelines create a single JVM and bpipe instance.

    Usage example in our server:

    1. You have some raw_data located in:
       /lustre2/raw_data/RUN_NAME/PINAME_PROJECTID_PROJECTNAME

       Es: /lustre2/raw_data/131212_SN859_0138_AC2PGHACXX/Project_Kajaste_80_LVTranscription

    2. You have to make analysis and generated intermediate file in:
        /lustre2/scratch/PINAME/PROJECTID_PROJECTNAME
        Es: /lustre2/scratch/Kajaste/80_LV_Transcription

    3. Enter "scratch dir" for your project:
       cd /lustre2/scratch/Kajaste/80_LV_Transcription

    4. Run bpipe-config from THERE pointing to samples in /lustre2/raw_data
       bpipe-config pipe bwa_submit_project /lustre2/raw_data/131212_SN859_0138_AC2PGHACXX/Project_Kajaste_80_LVTranscription/Sample_*
       bpipe run -r bwa_submit_project.groovy /lustre2/raw_data/131212_SN859_0138_AC2PGHACXX/Project_Kajaste_80_LVTranscription/Sample_*

    5. Project pipelines creates dirs in scratch for samples and link fastq files from original directory to scratch dir. By default output merged BAMs are in moved in directory "BAM"

* PIPES: pretend mode
  Due to compatibility issues with bpipe, I have changed the "test mode" in "pretend mode". In order to launch a whole pipeline in pretend mode (just create empty file locally, without launching any bioinformatics software) use:
    bpipe run -p pretend=true pipeline.groovy inputs.*

* command config: generate a bpipe.config resource file.

	1. Now don't require a SampleSheet.csv or a Project name
       by default the project name is USERNAME_DIRECTORY.
       Check of Project name still active for command pipe.

    2. Email Notification:
       now if your username is in a configuration list (config/email_list.groovy),
       bpipe will send you email notification when the pipeline is complete

* command pipe: generate a pipeline

	1. PROJECT PIPELINES:
	   SampleSheet.csv automatically created in project directory using info (validated) from sample dirs
	3. PIPELINE INFO:
	   Automatically generated in doc/pipeline_name.html

* command jvm: full info on JAVA VIRTUAL MACHINE CONFIGURATION

* command clean: remove .bpipe dirs from current dir or args (don't remove pipeline files)

* Logging:

	1. verbose option (less verbose by default)


* SampleSheet Vaidations (validateSample)

	1. Better validation for each sample in SampleSheet.csv during parsing