--USE_END_IN_UNPAIRED_READS and --USE_UNPAIRED_CLIPPED_END options for picard MarkDuplicates have no effect #1959
Comments
Looks like those options are specific to flow-based reads, therefore they will not be used at all unless you also set I've tried this with the example alignments you've provided and while it did run successfully, I'd recommend extra caution if you ever decide to use these options with non-flow-based single-end reads, since I'd assume they were not intended for such use.
Thanks. Indeed using Maybe helpful to add to the two help messages that
I agree with adding clarification(s) to the documentation. I'll try getting more details on whether these options are safe to be used with non-flow single-end reads before closing the issue.
perhaps there should be some error if those options are used without
Absolutely, if these are truly only meant to be used with flow-based reads.
@meganshand @ilyasoifer tagging for opinions on this (and potentially similar issues with other options in this/other tools)
@kockan - thanks! We will discuss and propose how to best deal with this.
@meganshand and @kockan, @ChrisHIV - unless you see a clear use case for single-end reads (that are not of a constant length in bases) we will update the help string as suggested by the issue to indicate that it should only be used for flow reads.
@ilyasoifer Sounds reasonable to me. One small additional request: if these options are set in the command-line arguments without
#1976 was merged, which adds a FLOW_ prefix to these options, but to close out this ticket we also need a check that flow mode is active when these options are specified.
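For illustration only, the kind of check requested above could look roughly like the Python sketch below. This is not Picard's code (Picard is written in Java): the `validate_flow_options` helper, the `flow_mode` parameter and the error wording are all hypothetical, and the option names are the ones from the issue title with the FLOW_ prefix mentioned above.

```python
# Hypothetical sketch, not Picard's implementation.
# Option names: issue-title names plus the FLOW_ prefix added by #1976.
FLOW_ONLY_OPTIONS = (
    "FLOW_USE_END_IN_UNPAIRED_READS",
    "FLOW_USE_UNPAIRED_CLIPPED_END",
)

def validate_flow_options(options, flow_mode):
    """Return one error message per flow-only option set while flow mode is off.

    `options` maps option names to booleans; `flow_mode` is a hypothetical
    flag standing in for however the tool records that flow mode is active.
    """
    if flow_mode:
        return []
    return [
        f"--{name} only applies to flow-based reads and requires flow mode"
        for name in FLOW_ONLY_OPTIONS
        if options.get(name, False)
    ]

# Example: one flow-only option set, flow mode off.
errors = validate_flow_options({"FLOW_USE_END_IN_UNPAIRED_READS": True}, flow_mode=False)
```

Returning messages rather than raising lets a caller report every offending option at once; whether the real tool should fail or only warn is the open question in this thread.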
@dror27 - can you please address the last comment so we can close this?
Bug Report
Affected tool(s)
picard MarkDuplicates with the --USE_END_IN_UNPAIRED_READS and --USE_UNPAIRED_CLIPPED_END options
Affected version(s)
Latest public release version [3.1.1]
Description
The --USE_END_IN_UNPAIRED_READS and --USE_UNPAIRED_CLIPPED_END options have no effect in minimal test data. From my understanding of the help messages for these options (reproduced at the bottom of this message), the former should toggle whether or not we consider unpaired reads to be duplicates if they have the same start position but different end positions, and the latter should toggle whether clipped ends of unpaired reads are included or excluded when determining duplicates (I do not understand whether inclusion/exclusion corresponds to true/false for this bool or vice versa).
Steps to reproduce
The attached reads_sam.txt has 8 reads mapped to an 8-bp reference genome, attached as reference_fasta.txt. (Both files have had their extensions changed to .txt to allow attachment.) The attached image shows the reads for convenience.
These reads all have the central 6bp mapped, but they vary in whether there is an additional base at one end or the other and whether that base is mapped or clipped. After renaming reads_sam.txt to reads_sam.sam to clarify the format, run e.g.
Expected behavior
The four output sam files corresponding to the four combinations of these two binary flags should vary in which subset of reads are included after removing duplicates, because the reads vary in their potential to be considered duplicates based on the description of the flags.
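To make that expectation concrete, here is a toy Python sketch of my reading of the first flag: unpaired reads are grouped either by start position alone or by start and end position. The `Read` tuple, the `dedup` helper and the coordinates are all made up for illustration; this is not Picard's algorithm, and it ignores strand, clipping and base quality.

```python
from typing import NamedTuple

class Read(NamedTuple):
    name: str
    start: int  # 1-based leftmost aligned position (toy value)
    end: int    # 1-based rightmost aligned position (toy value)

def dedup(reads, use_end):
    """Keep the first read seen for each key (toy model, not Picard's logic)."""
    seen = set()
    kept = []
    for r in reads:
        # With use_end, reads must share both start and end to be duplicates.
        key = (r.start, r.end) if use_end else (r.start,)
        if key not in seen:
            seen.add(key)
            kept.append(r.name)
    return kept

# Three hypothetical unpaired reads sharing a start position.
reads = [Read("a", 2, 7), Read("b", 2, 8), Read("c", 2, 7)]
start_only = dedup(reads, use_end=False)     # "b" and "c" collapse onto "a"
start_and_end = dedup(reads, use_end=True)   # "b" survives, "c" collapses onto "a"
```

Under this reading, toggling the flag changes which reads survive, which is why I expected the four outputs to differ.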
Actual behavior
The four output sam files are identical in their read content (containing reads 1 and 6).
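One way to check this is to compare the sets of read names across the output files. The sketch below is a minimal Python version of that comparison; the helper names and the inline SAM records are hypothetical stand-ins (not the attached data), and it compares read names only, not flags or other fields.

```python
def read_names(sam_text):
    """Return the set of read names (QNAME, first column) in SAM text.

    Lines starting with '@' are header lines and are skipped.
    """
    names = set()
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue
        names.add(line.split("\t", 1)[0])
    return names

def outputs_identical(sam_texts):
    """True if every SAM text contains exactly the same set of read names."""
    sets = [read_names(t) for t in sam_texts]
    return all(s == sets[0] for s in sets)

# Toy stand-in for the four outputs; in practice each string would be read
# from one output file, e.g. open(path).read().
toy_output = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "read1\t0\tref\t2\t60\t6M\t*\t0\t0\tACGTAC\t*\n"
    "read6\t0\tref\t2\t60\t6M\t*\t0\t0\tACGTAC\t*\n"
)
identical = outputs_identical([toy_output] * 4)
```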
Additional comments
I tried to follow the recommendation to first post on the forum, but clicking on the 'Sign in' tab takes me to this page where I cannot see any option to sign in or create a new profile.
And here is the help for those two options, for convenience: