Optional alignment subworkflow #83
- You're missing `align = true` in `conf/test_raw.config`.
- I think it's confusing that the `input` channel in `subworkflows/local/coverage_stats.nf` has different shapes depending on a parameter. I would suggest that you move the `SAMTOOLS_VIEW` that creates the CSI index to `workflows/blobtoolkit.nf` (in the `else` of `if (params.align)`) so that `ch_aligned` is always the BAM + CSI.
- There might be a problem with `bin/samplesheet.py`: by allowing `*.bam`, there might be a clash during `SAMTOOLS_VIEW` because the input and output files could have the same name. Maybe the `else` I was referring to in my previous point could check the file extension: if a BAM, then `samtools index`; if a CRAM, then `samtools view`?
- You need to add `pacbio_clr` as an allowed datatype in `bin/samplesheet.py`.
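To make the extension check in the third point concrete, here is a minimal Python sketch of the decision only. It is not the pipeline's code: the helper name `indexing_command` and the output naming are hypothetical, and in the pipeline this logic would live in the Nextflow workflow rather than in Python. It assumes `samtools index -c` to build a CSI index for an existing BAM, and `samtools view -b --write-index` to convert a CRAM to BAM and index it in one step.

```python
from pathlib import Path
from typing import List


def indexing_command(alignment_file: str) -> List[str]:
    """Pick the samtools command for one input file, based on its extension.

    Hypothetical helper for illustration; the name and the output file
    naming are assumptions, not part of the pipeline.
    """
    path = Path(alignment_file)
    suffix = path.suffix.lower()

    if suffix == ".bam":
        # Already a BAM: only build the CSI index. No new BAM is written,
        # so the input and output names cannot clash.
        return ["samtools", "index", "-c", str(path)]

    if suffix == ".cram":
        # CRAM: convert to BAM and write an index alongside it.
        # Decoding a CRAM may also need the reference (-T), omitted here.
        bam = path.with_suffix(".bam")
        return ["samtools", "view", "-b", "--write-index", "-o", str(bam), str(path)]

    raise ValueError(f"Unsupported alignment file extension: {alignment_file}")
```

With this split, a `*.bam` input never goes through `samtools view`, so the name clash cannot occur, and both branches end with a BAM plus an index.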
These tests ran on the farm without errors using
In the Snakemake version of the minimap2 alignment, the input is reads in FASTQ format, whereas here CRAM/BAM files are converted to FASTA and aligned again using minimap2. What is the advantage for a user of doing so instead of simply using the input alignment file? I only want to clarify why this is useful.
The following test worked as expected with Nextflow 22.10.1:
nextflow run . -profile test_raw,singularity -ansi-log false
Prepare genome subworkflow
Unaligned files can also be in CRAM or BAM format. So if the user provides such files together with the align flag, we can now align them here and run the rest of the steps. Previously, the user was required to provide aligned files. This is a step towards making this pipeline independent. I can look at adding support for FASTQ as well, but after we sort out the config file.
Closes #79
This PR adds an optional alignment subworkflow. It replicates the one currently in BTK. It is not meant to produce high-quality alignments. Quick, sub-optimal alignments are good enough for BTK, hence this approach with minimap2. Please suggest arguments that would improve the minimap2 alignment without introducing more extensive aligners like BWA-MEM.
I would especially like your input on `bin/samplesheet.py`, `subworkflows/local/coverage_stats.nf`, `subworkflows/local/minimap_alignment.nf` and `workflows/blobtoolkit.nf`.