Tag handling with --keep-tags creates invalid SAM output #53

Open
adamjorr opened this issue Sep 15, 2020 · 1 comment

@adamjorr

Hi,

I was attempting to re-align reads that were aligned with another aligner.
I was using the --keep-tags option, primarily because my reads carry RG and OQ tags that I care about. However, --keep-tags also copies every other tag, including MD, NM, MC, and AS. Since NGM sets these tags itself, its own values are appended to the end of the record, so each of these tags appears twice in the same read. This violates the SAM specification, which allows each tag to appear only once per alignment line, and consequently causes SAMtools to crash when it tries to parse the read.

As an example, the malformed read looks like this: 2 151M = 129799794 343 CCCTTGCTGCATGAGCCAGTAGCTGGGTGGGCATGGTAGCCTCTTGTCTTCCTAGCTTGCCCCTCCAGACATGGAACCTCCACACTGTGAGCGACTTGGTGTGGGGCAATCCAGGCAGATGTGCTCAGTCTGCCACACCTAGGATGGGGCT :862939:9=:=<<=9===<>4=>==<,;054=6;':=>8;/1/5;==?-<>??;<>>>9<<9?=&><7;;>28=.<<0:9-7>>@97<+<'+;3?>3)<:>[email protected]=@2:1-)>><?4?A).=??<)3=.;@>?A,*4@A5;#### MD:Z:10C48A91 PG:Z:MarkDuplicates.1E.5J RG:Z:HK2WY.5 NM:C:2 OQ:Z:####A7AA7,,FFFAA,A7,7FFA,,AF7AA<<,,7A7KF<,FFKKKFFAA<,7FAA<7,F,F7AKKFA,AA,FF,FA7FAF7FF(FKAFFAKKFFFKKKF,KKFF7,7,FAKF<,F7F<<<F,FKKKF<KAKKKAKKFKAFA<A<<,<<< UQ:C:22 AS:C:141 MQ:i:60 MC:Z:151M AS:i:1460 NM:i:2 NH:i:0 XI:f:0.9868 X0:i:0 XE:i:39 XR:i:151 MD:Z:91T48G10
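
A minimal sketch of how the duplication in a record like the one above could be detected, assuming the optional fields have already been split on whitespace. The helper name `duplicated_tags` is hypothetical and not part of NGM or SAMtools, and the OQ field is left out of the sample list for brevity.

```python
from collections import Counter


def duplicated_tags(optional_fields):
    """Return the tag names that occur more than once, in order of first appearance.

    Each optional field is expected in the SAM text form TAG:TYPE:VALUE.
    """
    names = [field.split(":", 1)[0] for field in optional_fields]
    counts = Counter(names)  # keeps first-seen order of the tag names
    return [name for name, n in counts.items() if n > 1]


# Optional fields taken from the malformed read above (OQ omitted for brevity).
fields = [
    "MD:Z:10C48A91", "PG:Z:MarkDuplicates.1E.5J", "RG:Z:HK2WY.5", "NM:C:2",
    "UQ:C:22", "AS:C:141", "MQ:i:60", "MC:Z:151M", "AS:i:1460", "NM:i:2",
    "NH:i:0", "XI:f:0.9868", "X0:i:0", "XE:i:39", "XR:i:151", "MD:Z:91T48G10",
]

print(duplicated_tags(fields))
```

In this record the copied MD, NM, and AS values come first and the values NGM wrote come last, which is why each of those tags shows up twice.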

For now, I'll work around this by extracting the reads and aligning them as FASTQ. If NGM is still being developed, I think good options would be to let the user specify which tags to keep when using --keep-tags, to have NGM overwrite the tags it outputs, or to give the user more control over which tags NGM outputs.
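
The first two suggestions could be combined roughly as follows. This is only a sketch of the idea in Python, not NGM's code: `merge_tags` and its `keep` parameter are hypothetical names, and the assumption is that the tags copied from the input and the tags the aligner generates are available as two separate lists.

```python
def merge_tags(copied, generated, keep=None):
    """Combine copied input tags with aligner-generated tags without duplicates.

    copied:    optional fields carried over from the input read (TAG:TYPE:VALUE)
    generated: optional fields the aligner wants to write itself
    keep:      optional set of tag names to carry over; None means carry over all

    A copied tag is dropped if the aligner generates a tag of the same name
    (the generated value wins) or if it is not listed in `keep`.
    """
    generated_names = {field.split(":", 1)[0] for field in generated}
    merged = []
    for field in copied:
        name = field.split(":", 1)[0]
        if name in generated_names:
            continue  # overwritten by the aligner's own value
        if keep is not None and name not in keep:
            continue  # the user did not ask for this tag
        merged.append(field)
    return merged + list(generated)


copied = ["MD:Z:10C48A91", "RG:Z:HK2WY.5", "NM:C:2", "AS:C:141", "MQ:i:60"]
generated = ["AS:i:1460", "NM:i:2", "MD:Z:91T48G10"]

print(merge_tags(copied, generated))                     # overwrite only
print(merge_tags(copied, generated, keep={"RG", "OQ"}))  # overwrite + keep list
```

With either call, every tag name appears at most once in the output, so the record stays parseable.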

@fritzsedlazeck
Member

Thanks. Yes, it's been a while since we made any changes to the code.
Cheers
Fritz
