Commit

Merge pull request #161 from sanger-tol/kmer_plot
Output k-mer spectra
weaglesBio authored Oct 18, 2023
2 parents 2413fe5 + c23127c commit 83de0c3
Showing 12 changed files with 460 additions and 18 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/ci.yml
@@ -45,7 +45,7 @@ jobs:
run: |
nextflow run ${GITHUB_WORKSPACE} -entry RAPID -profile test_github,docker --outdir ./results-rapid
-- name: Run FULL pipeline with test data
-# Remember that you can parallelise this by using strategy.matrix
-run: |
-nextflow run ${GITHUB_WORKSPACE} -profile test_github,docker --outdir ./results-full
+#- name: Run FULL pipeline with test data
+# # Remember that you can parallelise this by using strategy.matrix
+# run: |
+# nextflow run ${GITHUB_WORKSPACE} -profile test_github,docker --outdir ./results-full
30 changes: 21 additions & 9 deletions conf/base.config
@@ -101,14 +101,14 @@ process {
withName: '.*:.*:LONGREAD_COVERAGE:(MINIMAP2_ALIGN|MINIMAP2_ALIGN_SPLIT)' {
cpus = { check_max( 16 * 1, 'cpus' ) }
memory = { check_max( 100.GB * task.attempt, 'memory' ) }
-time = { check_max( 18.h * task.attempt, 'time' ) }
+time = { check_max( 20.h * task.attempt, 'time' ) }
}

-// For Large complex genomes > 4Gb
-// withName: '.*:.*:LONGREAD_COVERAGE:(MINIMAP2_ALIGN|MINIMAP2_ALIGN_SPLIT)' {
-//cpus = { check_max( 20 * 1, 'cpus' ) }
-//memory = { check_max( 400.GB * task.attempt, 'memory' ) }
-// time = { check_max( 300.h * task.attempt, 'time' ) }
+// For Large complex genomes > 4Gb
+//withName: '.*:.*:LONGREAD_COVERAGE:(MINIMAP2_ALIGN|MINIMAP2_ALIGN_SPLIT)' {
+// cpus = { check_max( 20 * 1, 'cpus' ) }
+// memory = { check_max( 400.GB * task.attempt, 'memory' ) }
+// time = { check_max( 300.h * task.attempt, 'time' ) }
//}

withName: '.*:.*:LONGREAD_COVERAGE:SAMTOOLS_SORT' {
@@ -121,6 +121,12 @@ process {
memory = { check_max( 25.GB * Math.ceil( task.attempt * 2 ), 'memory' ) }
}

+// For larger
+//withName:MUMMER {
+// cpus = { check_max( 12 * task.attempt, 'cpus' ) }
+// memory = { check_max( 50.GB * Math.ceil( task.attempt * 2 ), 'memory' ) }
+//}

withName:UCSC_BEDGRAPHTOBIGWIG {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 20.GB * task.attempt, 'memory' ) }
@@ -174,8 +180,14 @@ process {

// Large Genomes > 4Gb
//withName: BUSCO {
-//cpus = { check_max( 30 * task.attempt, 'cpus' ) }
-//memory = { check_max( 120.GB * task.attempt, 'memory' ) }
-//time = { check_max( 300.h * task.attempt, 'time' ) }
+// cpus = { check_max( 30 * task.attempt, 'cpus' ) }
+// memory = { check_max( 100.GB * task.attempt, 'memory' ) }
+// time = { check_max( 300.h * task.attempt, 'time' ) }
//}

+// Large Genomes > 4Gb
+withName: FASTK_FASTK {
+cpus = { check_max( 25 * task.attempt, 'cpus' ) }
+memory = { check_max( 100.GB * task.attempt, 'memory' ) }
+}
}
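
Reviewer note on the resource closures above: each `check_max( N * task.attempt, ... )` expression requests more resources on every retry and then caps the request at a pipeline-wide ceiling. The Python sketch below only illustrates that arithmetic for the new `FASTK_FASTK` block; it is not the pipeline's Groovy `check_max` helper, and the ceilings `max_cpus=64` and `max_memory_gb=256` are hypothetical values chosen for the example, not settings taken from this commit.

```python
def check_max(requested, limit):
    """Cap a requested resource at a configured ceiling (illustrative stand-in)."""
    return min(requested, limit)


def fastk_resources(attempt, max_cpus=64, max_memory_gb=256):
    """Mirror the FASTK_FASTK closures: 25 CPUs and 100 GB per attempt, capped.

    max_cpus and max_memory_gb are assumed ceilings for this sketch only.
    """
    return {
        "cpus": check_max(25 * attempt, max_cpus),
        "memory_gb": check_max(100 * attempt, max_memory_gb),
    }


if __name__ == "__main__":
    for attempt in (1, 2, 3):
        print(attempt, fastk_resources(attempt))
```

With these assumed ceilings, the first two attempts receive the full scaled request and the third is clipped on both CPUs and memory, which is why very large genomes may need higher ceilings as well as larger per-process values.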
13 changes: 12 additions & 1 deletion conf/modules.config
@@ -41,7 +41,7 @@ process {

// Files to be used for pretext, likely to be deleted once the hic workflow is complete.
// .bed, .hr.pretext, .lr.pretext, needs centromere}
-withName: 'SEQTK_CUTN|GAP_LENGTH|PRETEXTMAP_HIGHRES|PRETEXTMAP_STANDRD|COOLER_ZOOMIFY|COV_FOLDER|UCSC_BEDGRAPHTOBIGWIG|EXTRACT_TELO|JUICER_TOOLS_PRE|SNAPSHOT_SRES|SNAPSHOT_HRES|GET_PAIRED_CONTACT_BED' {
+withName: 'SEQTK_CUTN|GAP_LENGTH|PRETEXTMAP_HIGHRES|PRETEXTMAP_STANDRD|COOLER_ZOOMIFY|COV_FOLDER|UCSC_BEDGRAPHTOBIGWIG|EXTRACT_TELO|JUICER_TOOLS_PRE|SNAPSHOT_SRES|SNAPSHOT_HRES' {
publishDir = [
path: { "${params.outdir}/hic_files" },
mode: params.publish_dir_mode,
@@ -278,4 +278,15 @@ process {
ext.args = { '-k2,2 -nr' }
}

+withName: FASTK_FASTK {
+ext.args = "-k31 -t"
+}
+
+withName: MERQURYFK_MERQURYFK {
+publishDir = [
+path: { "${params.outdir}/hic_files" },
+mode: params.publish_dir_mode,
+pattern: '*.ref.spectra-cn.ln.png'
+]
+}
}
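
Reviewer note on what the two new modules compute: `FASTK_FASTK` counts k-mers (here with `-k31`) and `MERQURYFK_MERQURYFK` plots a copy-number spectrum from those counts. The Python sketch below is a conceptual illustration only, assuming canonical k-mers and plain in-memory counting; it does not reflect how FastK or MerquryFK are implemented, and the function names are hypothetical. A small `k` is used so the example stays readable.

```python
from collections import Counter

_BASES = frozenset("ACGT")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(sequences, k):
    """Count canonical k-mers across sequences, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= _BASES:
                counts[canonical(kmer)] += 1
    return counts


def copy_number_spectrum(read_counts, assembly_counts):
    """Group read k-mers by their copy number in the assembly.

    Returns {copy_number: Counter({read_multiplicity: distinct_kmers})}.
    """
    spectra = {}
    for kmer, multiplicity in read_counts.items():
        copies = assembly_counts.get(kmer, 0)
        spectra.setdefault(copies, Counter())[multiplicity] += 1
    return spectra


if __name__ == "__main__":
    reads = count_kmers(["ACGTACGT"], k=3)
    assembly = count_kmers(["ACGT"], k=3)
    print(copy_number_spectrum(reads, assembly))
```

Each curve in a copy-number spectrum plot corresponds to one key of the returned dictionary: k-mers absent from the assembly (copy number 0), present once, twice, and so on, each drawn as a histogram over read multiplicity.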
15 changes: 15 additions & 0 deletions docs/output.md
@@ -21,6 +21,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [selfcomp](#selfcomp) - Identifies regions of self-complementary sequence.
- [synteny](#synteny) - Generates syntenic alignments between other high quality genomes.
- [busco-analysis](#busco-analysis) - Uses BUSCO to identify ancestral elements. Also use to identify ancestral Lepidopteran genes (merian units).
+- [kmer](#kmer) - Counts k-mers and generates a copy-number spectrum plot.

- [pipeline-information](#pipeline-information) - Report metrics generated during the workflow execution

@@ -218,6 +219,20 @@ This worflows searches along predetermined path for syntenic genome files based

![Workflow Legend](https://raw.githubusercontent.com/sanger-tol/treeval/dev/docs/images/treeval_1_0_legend.jpeg)

+## kmer
+
+This workflow performs a k-mer count using [FASTK_FASTK](https://nf-co.re/modules/fastk_fastk), then passes the results to [MERQURYFK_MERQURYFK](https://nf-co.re/modules/merquryfk_merquryfk) to plot a copy-number k-mer spectrum.
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `hic_files/`
+- `*.ref.spectra-cn.ln.png`: .png file of the copy-number k-mer spectrum.
+
+</details>
+
+![Workflow Legend](https://raw.githubusercontent.com/sanger-tol/treeval/dev/docs/images/treeval_1_0_legend.jpeg)

## pipeline-information

[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage.
10 changes: 10 additions & 0 deletions modules.json
@@ -76,11 +76,21 @@
"installed_by": ["modules"],
"patch": "modules/nf-core/custom/getchromsizes/custom-getchromsizes.diff"
},
+"fastk/fastk": {
+"branch": "master",
+"git_sha": "29e87a37ae1887fc8289f2f56775604a71715cb9",
+"installed_by": ["modules"]
+},
"gnu/sort": {
"branch": "master",
"git_sha": "88f6e982fb8bd40488d837b3b08a65008e602840",
"installed_by": ["modules"]
},
+"merquryfk/merquryfk": {
+"branch": "master",
+"git_sha": "6f150e1503c0826c21fedf1fa566cdbecbe98ec7",
+"installed_by": ["modules"]
+},
"minimap2/align": {
"branch": "master",
"git_sha": "603ecbd9f45300c9788f197d2a15a005685b4220",
41 changes: 41 additions & 0 deletions modules/nf-core/fastk/fastk/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

52 changes: 52 additions & 0 deletions modules/nf-core/fastk/fastk/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

58 changes: 58 additions & 0 deletions modules/nf-core/merquryfk/merquryfk/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
