Fix genome size #146

Merged
merged 28 commits into from
Sep 25, 2023
Changes from 6 commits
Commits
25a27c6
Adding pretext ingestion docs
DLBPointon Sep 21, 2023
c4620c9
Update docs/usage.md
DLBPointon Sep 21, 2023
83c3f67
Fix for custom-getchromsizes
DLBPointon Sep 21, 2023
991ebca
Merge branch 'More-docs' of https://github.com/sanger-tol/treeval int…
DLBPointon Sep 21, 2023
0a44839
Removing <button type="button" class="collapsible">Open Collapsible</…
DLBPointon Sep 21, 2023
dc17a30
fix_genome_size
yumisims Sep 21, 2023
8fdd360
Updated docs to a singular file, i've used alot of HTML to make sure …
DLBPointon Sep 22, 2023
c7dfaaa
Deletion
DLBPointon Sep 22, 2023
f4ddbd9
Deletion
DLBPointon Sep 22, 2023
801c21a
Prettier
DLBPointon Sep 22, 2023
3505c9e
Adding pretext ingestion docs
DLBPointon Sep 21, 2023
37e79e9
Update docs/usage.md
DLBPointon Sep 21, 2023
fac87f2
Fix for custom-getchromsizes
DLBPointon Sep 21, 2023
d9946a5
Removing <button type="button" class="collapsible">Open Collapsible</…
DLBPointon Sep 21, 2023
d7f365f
Updated docs to a singular file, i've used alot of HTML to make sure …
DLBPointon Sep 22, 2023
af213ef
Deletion
DLBPointon Sep 22, 2023
877d9f0
Deletion
DLBPointon Sep 22, 2023
c7f77c8
Prettier
DLBPointon Sep 22, 2023
298e247
Merge branch 'More-docs' of https://github.com/sanger-tol/treeval int…
DLBPointon Sep 22, 2023
c39d28d
Regular markdown should work
muffato Sep 25, 2023
02d10f5
More changes
muffato Sep 25, 2023
750b0ec
More changes
muffato Sep 25, 2023
e0c8621
More changes (and prettier)
muffato Sep 25, 2023
7b58017
Changed genomic_alignment_data
muffato Sep 25, 2023
29e6598
Update usage.md
DLBPointon Sep 25, 2023
3d5ea17
Merge pull request #149 from sanger-tol/mm49_docs_reformat
DLBPointon Sep 25, 2023
5b4bb92
Merge branch 'More-docs' into fix_genome_size
DLBPointon Sep 25, 2023
580ee1c
Merge branch 'More-docs' into fix_genome_size
DLBPointon Sep 25, 2023
2 changes: 1 addition & 1 deletion conf/modules.config
@@ -22,7 +22,7 @@ process {

// Files to be uploaded to the TreeVal JBrowse2 instance
// .genome, .gz.{tbi|csi}, .bigBed, .bigWig, .paf
- withName: 'GENERATE_GENOME_FILE|TABIX_BGZIPTABIX|UCSC_BEDTOBIGBED|UCSC_BEDGRAPHTOBIGWIG|.*:.*:SYNTENY:MINIMAP2_ALIGN|.*:.*:GENERATE_GENOME:CUSTOM_GETCHROMSIZES' {
+ withName: 'GENERATE_GENOME_FILE|TABIX_BGZIPTABIX|UCSC_BEDTOBIGBED|UCSC_BEDGRAPHTOBIGWIG|.*:.*:SYNTENY:MINIMAP2_ALIGN|.*:.*:GENERATE_GENOME:GNU_SORT' {
publishDir = [
path: { "${params.outdir}/treeval_upload" },
mode: params.publish_dir_mode,
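
The one-line change above swaps the last alternative of the `withName` selector from `CUSTOM_GETCHROMSIZES` to `GNU_SORT`. As a rough sketch of what that alternative would and would not match, the snippet below tests it with `grep -E` against two process names. The fully qualified names (including the `MAIN:TREEVAL` prefix) are hypothetical and invented for illustration, and `grep` only approximates the regular-expression matching that Nextflow itself performs.

```bash
# Selector fragment added by this change (copied from the diff above).
selector='.*:.*:GENERATE_GENOME:GNU_SORT'

# Hypothetical fully qualified process names, invented for illustration.
names='MAIN:TREEVAL:GENERATE_GENOME:GNU_SORT
MAIN:TREEVAL:GENERATE_GENOME:CUSTOM_GETCHROMSIZES'

# -E: extended regular expressions, -x: the whole line must match.
matched=$(printf '%s\n' "$names" | grep -Ex "$selector")

printf '%s\n' "$matched"
```

Under these assumptions only the `GNU_SORT` name is selected, so after this change it is that process's output that would be published to `treeval_upload`.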
34 changes: 32 additions & 2 deletions docs/usage.md
@@ -93,10 +93,31 @@ samtools index {prefix}.cram

Find information on this here: [PacBio Data Prep](pacbio.md)

#### PreText Accessory ingestion

Note: This requires `bigWigToBedGraph` from the UCSC tools to be installed. Download instructions can be found at [EXAMPLE #3](https://genome.ucsc.edu/goldenPath/help/bigWig.html#:~:text=Alternatively%2C%20bigWig%20files%20can%20be,to%20the%20Genome%20Browser%20server.)

The PreText accessory files generated by the pipeline are not automatically ingested into the pretext files. To ingest them, use the following commands:

```bash
# Move into the pipeline's hic_files output directory.
cd {outdir}/hic_files

# Convert each bigWig track to bedGraph on stdout and pipe it into the pretext file.
bigWigToBedGraph {coverage.bigWig} /dev/stdout | PretextGraph -i {your.pretext} -n "coverage"

bigWigToBedGraph {repeat_density.bigWig} /dev/stdout | PretextGraph -i {your.pretext} -n "repeat_density"

# Telomere and gap tracks are already bedgraph files; set column 4 to 1000 before ingestion.
awk -v OFS="\t" '{$4 = 1000; print}' {telomere.bedgraph} | PretextGraph -i {your.pretext} -n "telomere"

awk -v OFS="\t" '{$4 = 1000; print}' {gap.bedgraph} | PretextGraph -i {your.pretext} -n "gap"
```
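
To make the `awk` step in the telomere and gap commands easier to follow, here is a minimal sketch that runs the same expression on a tiny bedgraph-style input. The scaffold name, coordinates and scores are made up for illustration and do not come from the pipeline.

```bash
# Hypothetical four-column bedgraph-style input; the values are made up for illustration.
sample=$(printf 'scaffold_1\t0\t100\t3\nscaffold_1\t100\t200\t7\n')

# Same awk expression as in the commands above: overwrite column 4 with 1000
# and rebuild each line with tab as the output separator.
flattened=$(printf '%s\n' "$sample" | awk -v OFS="\t" '{$4 = 1000; print}')

printf '%s\n' "$flattened"
```

Every row keeps its sequence name and coordinates, while the fourth column becomes the constant 1000, which is the form the commands above pipe into `PretextGraph`.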

## Full samplesheet

The samplesheet for this pipeline is as shown below. This yaml is parsed by the pipeline and converted into the relevant channels.
A real production version of this YAML can be found here: [nxOscDF5033.yaml](../assets/local_testing/nxOscDF5033.yaml)
YAML ("YAML Ain't Markup Language") is a human-readable format that we use to tell TreeVal a number of things, including the assembly location, telomere motif, PacBio data files (in fasta.gz format) and HiC cram files. The full YAML is detailed below.

### YAML contents

The following is an example YAML file we have used during production: [nxOscDF5033.yaml](../assets/local_testing/nxOscDF5033.yaml). It is shown below and contains some annotations we believe to be helpful. Information on the alignment, synteny, PacBio and HiC data is explained here: [pacbio](pacbio.md), [genealignment and synteny](genealignmentsynteny.md).
DLBPointon marked this conversation as resolved.

- `assembly`
- `sample_id`: ToLID of the sample.
@@ -286,3 +307,12 @@ We recommend adding the following line to your environment to limit this (typica
```console
NXF_OPTS='-Xms1g -Xmx4g'
```

## Nextflow memory requirements

In some cases, the Nextflow Java virtual machines can start to request a large amount of memory.
We recommend adding the following line to your environment to limit this (typically in `~/.bashrc` or `~/.bash_profile`):

```console
NXF_OPTS='-Xms1g -Xmx4g'
```
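
As a small sketch of how this setting might be applied in a shell session: the docs show a plain assignment, and using `export` here is an assumption on our part, made so that child processes such as `nextflow` inherit the variable.

```bash
# Export the variable so that processes started from this shell inherit it.
# (The docs show a plain assignment; `export` is an assumption of this sketch.)
export NXF_OPTS='-Xms1g -Xmx4g'

# Confirm the value in the current session.
echo "NXF_OPTS=${NXF_OPTS}"
```

Here `-Xms1g` sets the initial Java heap size to 1 GB and `-Xmx4g` caps the heap at 4 GB.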
yumisims marked this conversation as resolved.