Best parameters for a very fast, rough assembly #11
Comments
Yes, reducing the number of k-mers is the way to go. There are two k-mer loops in SKESA. The first one deals with k-mers that are shorter than the read length; the number of its iterations is controlled by --steps. The second loop uses long k-mers up to the insert length. It always runs 3 iterations for paired-end reads and is skipped for single-end reads. My suggestion for a bare-bones assembly of the example from releases:
On my computer this reduced the wall-clock time from 100s to 16s. The assembly will not be worse in terms of bad bases but will be more fragmented. The choice of --kmer is critical for fragmentation. My bet is 50%-80% of the read length, but you'll have to experiment. You may also try --hash_count. It will definitely save some memory and, for a high-coverage sample, might save some time. Please let me know the results of your experiments.
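For anyone who wants to script the experiment, here is a minimal sketch that lists a few candidate --kmer values spread across 50%-80% of the read length. The function name `candidate_kmers` and its parameters are hypothetical, not part of SKESA. The sketch also rounds each value down to an odd number, on the assumption that SKESA expects an odd k-mer size; please check that against your version.

```python
def candidate_kmers(read_length, low_pct=50, high_pct=80, count=4):
    """Return up to `count` odd k-mer sizes between low_pct% and high_pct% of read_length.

    Hypothetical helper for planning test runs; it does not call SKESA.
    """
    if read_length < 1 or count < 1:
        raise ValueError("read_length and count must be positive")
    if not 0 < low_pct <= high_pct <= 100:
        raise ValueError("need 0 < low_pct <= high_pct <= 100")

    lo = read_length * low_pct // 100
    hi = read_length * high_pct // 100

    sizes = []
    for i in range(count):
        # Spread the candidates evenly between lo and hi (integer arithmetic).
        k = lo if count == 1 else lo + (hi - lo) * i // (count - 1)
        # Assumption: the assembler wants an odd k-mer size, so round down to odd.
        if k % 2 == 0:
            k -= 1
        if k > 0 and k not in sizes:
            sizes.append(k)
    return sizes


if __name__ == "__main__":
    # Print one --kmer argument per candidate for 150 bp reads.
    for k in candidate_kmers(150):
        print(f"--kmer {k}")
```

Each printed value can be tried in a separate run, comparing wall-clock time and contig fragmentation to pick the trade-off that suits the use case.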
Thanks so much for the response. I will do some experimenting!
My first test was very promising: a Listeria isolate, with no problems in genotyping:
Does changing |
By default SKESA clips adapters/vectors from reads before assembling.
@tseemann did you notice any pros/cons of letting SKESA or some other tool trim adapters for you?
@lskatz our Illumina software is set up to remove all adapters, by putting the Nextera transposase in the SampleSheet.csv
For some use cases, we want a very rough assembly but we need it very quickly.
What options would you suggest to reduce the runtime?
I'm guessing using fewer k-mers would speed things up.
Would be grateful for any ideas you have.