Releases: PacificBiosciences/HiFi-human-WGS-WDL
v2.0.7
Thanks for bearing with us through the last few releases as we addressed issues affecting specific backends and combinations of inputs. We should start to see more stability now.
What's Changed
- Testing: Added unit tests for all tasks (except `read_pbsv_splits`) using DNAstack/wdl-ci #157
- Fix: `write_ped_phrank` bug affecting `singleton.wdl` entrypoint in Cromwell.
Full Changelog: v2.0.6...v2.0.7
v2.0.6
This is a bugfix release.
What's Changed
- Revert SC1117 fixes that caused issues with miniwdl. by @williamrowell in #169
Full Changelog: v2.0.5...v2.0.6
v2.0.5
This patch primarily addresses an issue with provisioning write_ped_phrank on GCP and Terra.
- cleaned up documentation
- cleaned up input templates
- addressed shellcheck issues for several command calls
Full Changelog: v2.0.4...v2.0.5
v2.0.4
What's Changed
This release addresses an issue with how the workflow_version output string was generated. This was not detected by linting tools and only caused issues at runtime on Cromwell.
- v2.0.4 by @williamrowell in #156
Full Changelog: v2.0.3...v2.0.4
v2.0.3
What's Changed
- a change to `write_ped_phrank` (for wdlTools/DNAnexus compatibility) in v2.0.2 broke Cromwell compatibility; this has been fixed
- a change to `pbsv_call` (for miniwdl compatibility) in v2.0.2 broke wdlTools/DNAnexus compatibility; this has been fixed
- @informationsea fixed a typo in the stats output
- wiki submodule was removed, docs are in `docs` subdirectory
Full Changelog: v2.0.2...v2.0.3
v2.0.2
v1.2.1
Updated json2ped.py to handle missing sample sex input. 3d62f62
Note: The v1 branch will not receive new features.
Clone this branch with:
git clone \
--depth 1 --branch v1.2.1 \
--recursive \
https://github.com/PacificBiosciences/HiFi-human-WGS-WDL.git
Full Changelog: v1.2.0...v1.2.1
v2.0.1
- Fixed unreachable `pbsv_call` bug.
- Modified `write_ped_phrank` behavior to correctly handle missing (`null`) sample sex.
Full Changelog: v2.0.0...v2.0.1
v2.0.0
PacBio WGS Variant Pipeline v2.0.0
This is a major restructuring of the v1 workflow. Please read the documentation before filing issues.
Structural changes
- There are two entry-points, `singleton.wdl` and `family.wdl`.
  - `singleton.wdl` has a flattened input/output structure that should have better compatibility with platforms like Terra.
  - `family.wdl` includes joint calling tasks for small variants and structural variants.
  - The `family.wdl` entrypoint can be used for both single sample (singleton) and multisample (duo, trio, quad, etc.) inputs, allowing for a single workflow to be used for all analyses. The per-sample outputs will be arrays in the same order as the sample input. The `singleton.wdl` entrypoint will be maintained for backends that need flattened inputs and outputs.
- phenotype field has been changed from Array[String] to String, a comma-delimited string, e.g., "HP:0000118,HP:0000001"
- Static inputs like reference FASTA and BED files are now referenced through new "map" files to simplify inputs.json structure.
- Workflow `inputs.json` files have been greatly simplified.
- Most tasks have been moved to the `wdl-common` submodule for reuse.
- AWS AGC has been deprecated by AWS, and support has been removed.
- AWS HealthOmics support has been added (needs improved documentation). Added script to deploy container to private ECR repo for HealthOmics.
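For anyone migrating v1 inputs, the phenotype change noted above (from Array[String] to a comma-delimited String) can be sketched in Python as follows. This is only an illustration: the helper name `phenotypes_to_string` is hypothetical and is not part of the workflow or its scripts.

```python
def phenotypes_to_string(terms):
    """Join a v1-style list of HPO terms into a v2-style comma-delimited string.

    Hypothetical helper for illustration; not shipped with the workflow.
    """
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [t.strip() for t in terms if t and t.strip()]
    return ",".join(cleaned)


v1_phenotypes = ["HP:0000118", "HP:0000001"]
print(phenotypes_to_string(v1_phenotypes))
```

The resulting string can then be placed in the phenotype field of the v2 inputs.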
New features:
- If aligned BAMs are provided as input to the workflow, alignment and phasing information will be stripped and the reads will be realigned. If the input BAM has consensus kinetics tags, these will be stripped as well.
- Sex (or more specifically, presence or absence of chrY) is inferred by relative chrY aligned depth. This will never override user-defined sex, but is used if the sex is not provided by user.
- HiPhase now jointly phases small variants (DeepVariant), structural variants (PBSV), and tandem repeats (TRGT).
- Merged TRGT VCF will be generated by the family workflow.
- Pharmacogenomics analysis with StarPhase and PharmCAT.
- Updated tertiary analysis with gnomAD v4.1 and CoLoRSdb population datasets.
- High level summary statistics (e.g., mean depth, variant counts by type, etc.) output directly by workflow in the form of workflow metadata output (e.g. miniwdl `outputs.json`) and a flat `stats.txt` TSV.
- Many QC plots have been added:
- read length histogram
- read quality histogram
- aligned depth distribution and cumulative depth distribution
- alignment MAPQ histogram
- alignment gap compressed identity histogram
- SNV distribution heatmap
- small indel size histogram
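The sex inference rule described in the feature list above (user-defined sex always wins; otherwise infer from relative chrY aligned depth) could look roughly like the sketch below. It rests on assumptions not stated in these notes: that chrY depth is compared against autosomal depth, the cutoff value, the `MALE`/`FEMALE` labels, and the function name `resolve_sex` are all hypothetical.

```python
from typing import Optional

# Hypothetical cutoff for illustration only; the workflow's real threshold
# is not given in these release notes.
CHRY_RATIO_THRESHOLD = 0.1


def resolve_sex(user_sex: Optional[str],
                chry_depth: float,
                autosome_depth: float,
                threshold: float = CHRY_RATIO_THRESHOLD) -> str:
    """Return the user-defined sex if present, else infer it from chrY depth."""
    # A user-provided value is never overridden.
    if user_sex is not None:
        return user_sex
    if autosome_depth <= 0:
        raise ValueError("autosome_depth must be positive")
    # Assumption: "relative" means chrY depth divided by autosomal depth.
    ratio = chry_depth / autosome_depth
    return "MALE" if ratio >= threshold else "FEMALE"


print(resolve_sex(None, chry_depth=15.0, autosome_depth=30.0))
print(resolve_sex("FEMALE", chry_depth=15.0, autosome_depth=30.0))
```

Strictly speaking, a rule like this reports presence or absence of chrY, which is why the notes are careful to distinguish it from user-defined sex.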
Tool updates
pbmm2 1.16.0
mosdepth v0.3.9
DeepVariant v1.6.1
pbsv v2.10.0
Paraphase v3.1.1
TRGT v1.2.0
HiPhase v1.4.5
HiFiCNV v1.0.1
pb-StarPhase v1.0.0
PharmCAT v2.15.4
slivar v0.3.1
CoLoRSdb v1.1.0
v2.0.0-rc6
- Added flat stats file and added/formatted plots
  - plots:
    - read length histogram
    - read quality histogram
    - aligned depth distribution
    - alignment MAPQ histogram
    - alignment gap compressed identity histogram
    - SNV distribution heatmap
    - small indel size histogram
  - stats:
    - many stats added, output both in metadata as well as a flat file
  - workflow metadata:
    - output workflow_name and workflow_version into metadata
- Compressed roh.out and cpg_pileup BED files to save space.
- Modifications to tertiary analysis tasks for better compatibility with DNAnexus and HealthOmics.
- Added script to deploy container to private ECR repo for HealthOmics. Documentation to follow.
- Tool and dataset updates:
  - updated zenodo doi
  - updated all `ref_map` and `tertiary_map` files accordingly
  - updated pbmm2 to 1.16.0
  - updated pbsv to 2.10.0
  - updated HiFiCNV to 1.0.1
  - updated pb-StarPhase to 1.0.1
  - updated PharmCAT to 2.15.4
Full Changelog: v2.0.0-rc5...v2.0.0-rc6