Commit

Merge pull request #43 from pdimens/docs_dev
update verbiage
pdimens authored Feb 19, 2024
2 parents fbe9bb5 + 90c4e71 commit 776d3a4
Showing 10 changed files with 43 additions and 32 deletions.
10 changes: 5 additions & 5 deletions Modules/Align/bwa.md
@@ -81,11 +81,10 @@ if the primary alignment was marked as a duplicate. Duplicates get marked but **
- ignores (but retains) barcode information
- fast

-The [BWA MEM](https://github.com/lh3/bwa) workflow is substantially simpler and faster than the EMA workflow
-and maps all reads against the reference genome, no muss no fuss. Duplicates are marked using
-[sambamba](https://lomereiter.github.io/sambamba/). The `BX:Z` tags in the read headers are still added
-to the alignment headers, even though barcodes are not used to inform mapping. The `-m` threshold is used
-for alignment molecule assignment.
+The [BWA MEM](https://github.com/lh3/bwa) workflow is much simpler and faster than the EMA workflow
+and maps all reads against the reference genome. Duplicates are marked using `samtools markdup`.
+The `BX:Z` tags in the read headers are still added to the alignment headers, even though barcodes
+are not used to inform mapping. The `-m` threshold is used for alignment molecule assignment.

```mermaid
graph LR
@@ -106,6 +105,7 @@ graph LR
```
+++ :icon-file-directory: BWA output
The `harpy align` module creates an `Align/bwa` directory with the folder structure below. `Sample1` is a generic sample name for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.
```
Align/bwa
├── Sample1.bam
6 changes: 3 additions & 3 deletions Modules/Align/ema.md
@@ -97,8 +97,7 @@ information, the EMA workflow is a bit more complicated under the hood. Reads wi
barcodes are aligned using EMA and reads without valid barcodes are separately mapped
using BWA before merging all the alignments together again. EMA will mark duplicates
within alignments, but the BWA alignments need duplicates marked manually using
-[sambamba](https://lomereiter.github.io/sambamba/). Thankfully, you shouldn't need
-to worry about any of these details.
+`samtools markdup`.

```mermaid
graph LR
@@ -121,7 +120,8 @@ graph LR
```
+++ :icon-file-directory: EMA output
-The `harpy align` module creates an `Align/ema` directory with the folder structure below. `Sample1` is a generic sample name for demonstration purposes.
+The `harpy align` module creates an `Align/ema` directory with the folder structure below. `Sample1` is a generic sample name for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.
```
Align/ema
├── Sample1.bam
1 change: 1 addition & 0 deletions Modules/SV/leviathan.md
@@ -111,6 +111,7 @@ graph LR
```
+++ :icon-file-directory: leviathan output
The `harpy variants --method leviathan` module creates a `Variants/leviathan` (or `leviathan-pop`) directory with the folder structure below. `sample1` and `sample2` are generic sample names for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.

```
Variants/leviathan/
1 change: 1 addition & 0 deletions Modules/SV/naibr.md
@@ -149,6 +149,7 @@ graph LR
The `harpy sv --method naibr` module creates a `Variants/naibr` (or `naibr-pop`)
directory with the folder structure below. `sample1` and `sample2` are generic sample
names for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.

```
Variants/naibr/
1 change: 1 addition & 0 deletions Modules/demultiplex.md
@@ -80,6 +80,7 @@ graph LR
+++ :icon-file-directory: demultiplexing output
The `harpy demultiplex` module creates a `Demultiplex/PREFIX` directory with the folder structure below, where `PREFIX` is the prefix of your input file that Harpy
infers by removing the file extension and forward/reverse distinction. `Sample1` and `Sample2` are generic sample names for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.
```
Demultiplex/PREFIX
├── Sample1.F.fq.gz
1 change: 1 addition & 0 deletions Modules/impute.md
@@ -190,6 +190,7 @@ The `harpy impute` module creates an `Imputation` directory with the folder stru
are generic contig names from an imaginary `genome.fasta` for demonstration purposes. The directory `model1/`
is a generic name to reflect the corresponding parameter row in the stitch parameter
file, which would have explicit names in real use (e.g. `modelpseudoHaploid_useBXTrue_k10_s1_nGen50/`).
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.

```
Impute/
1 change: 1 addition & 0 deletions Modules/phase.md
@@ -99,6 +99,7 @@ graph LR
+++ :icon-file-directory: phasing output
The `harpy phase` module creates a `Phase` directory with the folder structure below. `Sample1` is a generic sample name for demonstration purposes.
If using the `--ignore-bx` option, the output directory will be named `Phase.noBX` instead.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.

```
Phase/
1 change: 1 addition & 0 deletions Modules/qc.md
@@ -50,6 +50,7 @@ graph LR

+++ :icon-file-directory: qc output
The `harpy qc` module creates a `QC` directory with the folder structure below. `Sample1` and `Sample2` are generic sample names for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.
```
QC/
├── Sample1.R1.fq.gz
1 change: 1 addition & 0 deletions Modules/snp.md
@@ -117,6 +117,7 @@ graph LR
The `harpy snp` module creates a `Variants/METHOD` directory with the folder structure below where `METHOD` is what
you specify as the `--method` (mpileup or freebayes). `contig1` and `contig2` are generic contig names from an imaginary
`genome.fasta` for demonstration purposes.
+The resulting folder also includes a `workflow` directory (not shown) with workflow-relevant runtime files and information.
```
Variants/METHOD
├── variants.normalized.bcf
52 changes: 28 additions & 24 deletions software.md
@@ -9,27 +9,31 @@ HARPY is the sum of its parts, and out of tremendous respect for the developers

Issues with specific tools might warrant a discussion with the authors/developers on the repositories of these projects.

-| Software | Website | Publication |
-|:-----------|:-------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------|
-| bash | [website](https://www.gnu.org/software/bash/) | |
-| bcftools | [website](https://samtools.github.io/bcftools/bcftools.html) | |
-| bgzip | [website](http://www.htslib.org/doc/bgzip.html) | |
-| bwa | [website](https://github.com/lh3/bwa) | [pubication](http://arxiv.org/abs/1303.3997) |
-| click | [website](https://github.com/pallets/click) | |
-| conda | [website](https://github.com/conda) | |
-| EMA | [website](https://github.com/arshajii/ema) | [publication](https://www.biorxiv.org/content/early/2017/11/16/220236) |
-| fastp | [website](https://github.com/OpenGene/fastp) | [publication](https://doi.org/10.1093/bioinformatics/bty560) |
-| HapCUT2 | [website](https://github.com/vibansal/HapCUT2) | [publication](https://doi.org/10.1101/gr.213462.116) |
-| LEVIATHAN | [website](https://github.com/morispi/LEVIATHAN) | [publication](https://doi.org/10.1101/2021.03.25.437002) |
-| LRez | [website](https://github.com/morispi/LRez) | [publication](https://academic.oup.com/bioinformaticsadvances/article/1/1/vbab022/6375438?login=false) |
-| mamba | [website](https://github.com/mamba-org/mamba) | |
-| NAIBR | [website](https://github.com/raphael-group/NAIBR) + [fork](https://github.com/pontushojer/NAIBR) | [publication](https://doi.org/10.1093/bioinformatics/btx712) |
-| python | [website](https://www.python.org/) | |
-| rich | [webiste](https://github.com/Textualize/rich) | |
-| rich-click | [website](https://github.com/ewels/rich-click) | |
-| sambamba | [website](https://github.com/biod/sambamba) | [publication](https://doi.org/10.1093/bioinformatics/btv098) |
-| samtools | [website](http://www.htslib.org/) | |
-| seqtk | [website](https://github.com/lh3/seqtk) | |
-| Snakemake | [website](https://github.com/snakemake/snakemake) | [publication](https://f1000research.com/articles/10-33/v1) |
-| STITCH | [website](https://github.com/rwdavies/STITCH) | [publication](https://doi.org/10.1038%2Fng.3594) |
-| whatshap | [website](https://github.com/whatshap/whatshap) | [publication](https://doi.org/10.1101/085050) |
+| Software | Links |
+|:------------|:--------------------------------------------------------------------------------------------------------------------|
+| bash | [website](https://www.gnu.org/software/bash/) |
+| bcftools | [website](https://samtools.github.io/bcftools/bcftools.html) |
+| bgzip | [website](http://www.htslib.org/doc/bgzip.html) |
+| bwa | [website](https://github.com/lh3/bwa), [publication](http://arxiv.org/abs/1303.3997) |
+| click | [website](https://github.com/pallets/click) |
+| conda | [website](https://github.com/conda) |
+| EMA | [website](https://github.com/arshajii/ema), [publication](https://www.biorxiv.org/content/early/2017/11/16/220236) |
+| fastp | [website](https://github.com/OpenGene/fastp), [publication](https://doi.org/10.1093/bioinformatics/bty560) |
+| HapCUT2 | [website](https://github.com/vibansal/HapCUT2), [publication](https://doi.org/10.1101/gr.213462.116) |
+| LEVIATHAN | [website](https://github.com/morispi/LEVIATHAN), [publication](https://doi.org/10.1101/2021.03.25.437002) |
+| LRez | [website](https://github.com/morispi/LRez), [publication](https://academic.oup.com/bioinformaticsadvances/article/1/1/vbab022/6375438?login=false) |
+| mamba | [website](https://github.com/mamba-org/mamba) |
+| NAIBR | [website](https://github.com/raphael-group/NAIBR), [fork](https://github.com/pontushojer/NAIBR), [publication](https://doi.org/10.1093/bioinformatics/btx712) |
+| plotly | [website](https://plotly.com/) |
+| python | [website](https://www.python.org/) |
+| R | [website](https://www.r-project.org/) |
+| r-circlize | [website](https://github.com/jokergoo/circlize), [publication](https://doi.org/10.1093/bioinformatics/btu393) |
+| r-tidyverse | [website](https://www.tidyverse.org/), [publication](https://doi.org/10.21105/joss.01686) |
+| r-DT | [website](https://rstudio.github.io/DT/), [js-website](http://datatables.net) |
+| rich | [website](https://github.com/Textualize/rich) |
+| rich-click | [website](https://github.com/ewels/rich-click) |
+| samtools | [website](http://www.htslib.org/) |
+| seqtk | [website](https://github.com/lh3/seqtk) |
+| Snakemake | [website](https://github.com/snakemake/snakemake), [publication](https://f1000research.com/articles/10-33/v1) |
+| STITCH | [website](https://github.com/rwdavies/STITCH), [publication](https://doi.org/10.1038%2Fng.3594) |
+| whatshap | [website](https://github.com/whatshap/whatshap), [publication](https://doi.org/10.1101/085050) |
