specify custom adapters to split_on_adapter #19
Comments
Hi @MortenEneberg, could you try setting PCR primers here: https://github.com/nanoporetech/duplex-tools/blob/master/duplex_tools/split_on_adapter.py#L36 then, run with the "PCR" setting like the command below:
It may give you the following, in which case we would have a think of how to retain the tail-barcode in the left read, and the head barcode in the right read.
Cheers |
Hi @onordesjo, I tried with just a few reads where I knew the structure. One was a read where nothing was supposed to happen, having a structure like: The next one was a read with the following structure, supposed to be split into 3 reads, as I set --allow-multiple-splits: And this was split into 2 reads with the following structures: Read 2: meaning that the following was discarded: Interestingly, the beginning of read 2 contains the end of PRIMER2, meaning that not all of the primer was in the cut-out part of the read. I would have liked these 3 reads to be the output instead: Same splits if run without I used the following primers in the split_on_adapter.py file:
On the following 4 reads, where I only presented the first 2 reads here: With this command: Cheers! |
Hi @MortenEneberg, Getting there it seems! If the last SEQ is rather short, it may be the case that it's masked (to not accidentally split reads right at the end). You could try to use the additional options |
Hi @onordesjo, I just updated the splits for the second example (7/9 at 10AM), as I had made a small error. I tried your suggested settings. The read where nothing was supposed to happen, having a structure like: thus discarding: The next one was a read with the following structure, supposed to be split into 3 reads: With the new settings, this was split into 2 reads with the following structures: meaning that the following was discarded: Do you have any clue on how to solve this? Cheers! |
Hi @onordesjo, I appreciate your help! Did you have a chance to look at it yet? Kind regards |
Hi @MortenEneberg, sorry, I don't have much bandwidth to look at this at the moment. Would you be able to use a debugger to step through split_on_adapter and see where the decisions are made? I'd suggest starting on this line, which is where all results are found for matches against the subsequence: https://github.com/nanoporetech/duplex-tools/blob/master/duplex_tools/split_on_adapter.py#L142 |
Hi @onordesjo, Have you had the time to give it a look? Kind regards, |
Hi @MortenEneberg, I have started to look at it, but will probably need to add the barcode sequences to this plot to get a better view of what should be written out. I've added in imperfect matches to both of your primer sequences (using |
Hi @onordesjo, Thank you for looking into it! I have attached the barcodes here, along with the sequencing adapter: Single_barcodes_rev_for.txt Note that the attached file contains the primer sequences. When reading one strand, the barcodes at the 5' and 3' ends will be the same. Kind regards,
Thanks Morten! I'll add those in and try to get it straightened out. My feeling is that it'll be easiest in this use case to use a standalone tool (since the front adapter is not actually expected to be in the middle). Do note, by the way, that the targets being matched to are these: I forgot to point that out previously, but it is obviously relevant if you're not actually expecting part of the adapter to be between the primers: So basically what's being searched for is
|
Dear @onordesjo , Yes it looks correct! I have attached a paint image (not pretty..) just to make sure we are on the same page :) Morten |
Dear @onordesjo, Thanks for your help! Did you have a chance to look at it yet? Kind regards, |
Hi Morten! Sorry, I wasn't clear in the last message. I don't think it's something we're planning to support, since it's a rather special use case. You could definitely give it a go and replace:
with:
and see if you get the right matches then. Again, sorry for not being clear and for not being able to put more resources into this! |
We see a large number of chimeras with no intermediate sequencing adapters when using the ligation sequencing kit 110. We use the SQK-LSK110 on multiplexed samples that have 24-nt barcodes at each end. After barcoding (in an index PCR), the samples are pooled and used as input to the SQK-LSK110 kit. We have up to 24 different barcode sets.
Chimeric sequences have structures like:
5'-ADAPTER-Y-TOP-BARCODEX-PRIMER-SEQ-PRIMER-BARCODEX-BARCODEY-PRIMER-SEQ-PRIMER-BARCODEY-3'
Here, ADAPTER-Y-TOP is the ONT sequencing adapter, and BARCODEX and BARCODEY are the barcodes introduced by an index PCR targeting the PRIMER sites. The primer sites are ligated to the sequence of interest (SEQ). We have observed up to 14 barcode pairs in a single read.
We would like to split reads at sites where barcodes are adjacent to each other, so that the read above becomes:
5'-ADAPTER-Y-TOP-BARCODEX-PRIMER-SEQ-PRIMER-BARCODEX-3'
and
5'-BARCODEY-PRIMER-SEQ-PRIMER-BARCODEY-3'
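To make the requested behaviour concrete, here is a minimal Python sketch of splitting a read wherever two barcodes sit directly next to each other. It is not part of duplex-tools and does not reflect how split_on_adapter works internally. The function names, the `max_gap` parameter and the toy sequences are all hypothetical, and it assumes exact barcode matches, whereas real nanopore reads would need approximate matching.

```python
from typing import Dict, List, Tuple


def find_barcode_hits(read: str, barcodes: Dict[str, str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, name) for every exact barcode occurrence, sorted by start."""
    hits = []
    for name, seq in barcodes.items():
        pos = read.find(seq)
        while pos != -1:
            hits.append((pos, pos + len(seq), name))
            pos = read.find(seq, pos + 1)
    return sorted(hits)


def split_at_adjacent_barcodes(read: str, barcodes: Dict[str, str], max_gap: int = 0) -> List[str]:
    """Cut the read between two consecutive barcode hits that are at most max_gap bases apart.

    The cut is placed right after the left barcode, so any gap bases go to the right-hand read.
    """
    hits = find_barcode_hits(read, barcodes)
    cut_points = []
    for (_, left_end, _), (right_start, _, _) in zip(hits, hits[1:]):
        if 0 <= right_start - left_end <= max_gap:
            cut_points.append(left_end)
    bounds = [0] + cut_points + [len(read)]
    return [read[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy sequences (hypothetical, far shorter than the real 24-nt barcodes).
ADAPTER = "CTCTCT"
BARCODE_X = "AAAACCCC"
BARCODE_Y = "GGGGTTTT"
PRIMER = "ACGTACGT"
SEQ1 = "CATCATCAT"
SEQ2 = "TAGTAGTAG"

# Chimeric read with the structure described in the issue.
read = (ADAPTER + BARCODE_X + PRIMER + SEQ1 + PRIMER + BARCODE_X
        + BARCODE_Y + PRIMER + SEQ2 + PRIMER + BARCODE_Y)

parts = split_at_adjacent_barcodes(read, {"X": BARCODE_X, "Y": BARCODE_Y})
for part in parts:
    print(part)
```

With these toy sequences the read is cut once, at the BARCODEX/BARCODEY junction, so both barcodes of each pair stay with their own read. With real data, `max_gap` would probably need to be a few bases to tolerate basecalling errors at the junction.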