You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I was attempting to re-align reads that were aligned with another aligner.
I was using the --keep-tags option, primarily because I have RG tags and OQ tags that I care about on my reads. However, with --keep-tags, the other tags including MD, NM, MC, and AS are also copied. Since NGM also sets these tags, they are appended to the end of the read so that these tags all appear twice in the read. This is a violation of the SAM specification and consequently causes SAMtools to crash when it tries to parse the read.
As an example, the malformed read looks like this: 2 151M = 129799794 343 CCCTTGCTGCATGAGCCAGTAGCTGGGTGGGCATGGTAGCCTCTTGTCTTCCTAGCTTGCCCCTCCAGACATGGAACCTCCACACTGTGAGCGACTTGGTGTGGGGCAATCCAGGCAGATGTGCTCAGTCTGCCACACCTAGGATGGGGCT :862939:9=:=<<=9===<>4=>==<,;054=6;':=>8;/1/5;==?-<>??;<>>>9<<9?=&><7;;>28=.<<0:9-7>>@97<+<'+;3?>3)<:>[email protected]=@2:1-)>><?4?A).=??<)3=.;@>?A,*4@A5;#### MD:Z:10C48A91 PG:Z:MarkDuplicates.1E.5J RG:Z:HK2WY.5 NM:C:2 OQ:Z:####A7AA7,,FFFAA,A7,7FFA,,AF7AA<<,,7A7KF<,FFKKKFFAA<,7FAA<7,F,F7AKKFA,AA,FF,FA7FAF7FF(FKAFFAKKFFFKKKF,KKFF7,7,FAKF<,F7F<<<F,FKKKF<KAKKKAKKFKAFA<A<<,<<< UQ:C:22 AS:C:141 MQ:i:60 MC:Z:151M AS:i:1460 NM:i:2 NH:i:0 XI:f:0.9868 X0:i:0 XE:i:39 XR:i:151 MD:Z:91T48G10
For now, I'll get around this by getting the reads and aligning them as FASTQ, but if NGM is still being developed I think a good option would be to allow the user to specify which tags to keep when using --keep-tags, have NGM overwrite tags it outputs, or allow more user control over which tags are output by NGM.
The text was updated successfully, but these errors were encountered:
Hi,
I was attempting to re-align reads that were aligned with another aligner.
I was using the
--keep-tags
option, primarily because I have RG tags and OQ tags that I care about on my reads. However, with--keep-tags
, the other tags including MD, NM, MC, and AS are also copied. Since NGM also sets these tags, they are appended to the end of the read so that these tags all appear twice in the read. This is a violation of the SAM specification and consequently causes SAMtools to crash when it tries to parse the read.As an example, the malformed read looks like this:
2 151M = 129799794 343 CCCTTGCTGCATGAGCCAGTAGCTGGGTGGGCATGGTAGCCTCTTGTCTTCCTAGCTTGCCCCTCCAGACATGGAACCTCCACACTGTGAGCGACTTGGTGTGGGGCAATCCAGGCAGATGTGCTCAGTCTGCCACACCTAGGATGGGGCT :862939:9=:=<<=9===<>4=>==<,;054=6;':=>8;/1/5;==?-<>??;<>>>9<<9?=&><7;;>28=.<<0:9-7>>@97<+<'+;3?>3)<:>[email protected]=@2:1-)>><?4?A).=??<)3=.;@>?A,*4@A5;#### MD:Z:10C48A91 PG:Z:MarkDuplicates.1E.5J RG:Z:HK2WY.5 NM:C:2 OQ:Z:####A7AA7,,FFFAA,A7,7FFA,,AF7AA<<,,7A7KF<,FFKKKFFAA<,7FAA<7,F,F7AKKFA,AA,FF,FA7FAF7FF(FKAFFAKKFFFKKKF,KKFF7,7,FAKF<,F7F<<<F,FKKKF<KAKKKAKKFKAFA<A<<,<<< UQ:C:22 AS:C:141 MQ:i:60 MC:Z:151M AS:i:1460 NM:i:2 NH:i:0 XI:f:0.9868 X0:i:0 XE:i:39 XR:i:151 MD:Z:91T48G10
For now, I'll get around this by getting the reads and aligning them as FASTQ, but if NGM is still being developed I think a good option would be to allow the user to specify which tags to keep when using
--keep-tags
, have NGM overwrite tags it outputs, or allow more user control over which tags are output by NGM.The text was updated successfully, but these errors were encountered: