Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finding allele to mutate to in an efficient way #9

Open
frederic-michaud opened this issue Feb 2, 2018 · 3 comments
Open

Finding allele to mutate to in an efficient way #9

frederic-michaud opened this issue Feb 2, 2018 · 3 comments
Assignees

Comments

@frederic-michaud
Copy link
Collaborator

The way commit 4e87c16 handles the situation about how to check if a mutation is possible is not satisfactory.
If only two alleles exist, it has anyways to switch to the other one, expect is the probability to switch to it is zero.
Today, Elisa point to me some input file (too complicated to be reproduced here) where the simulation was very slow. Tracking things down, it turns out that the automatically build array for mutation frequency had the values 0.99999999... and 0.000000001... This might be seen as a bug though it was actually following a known path.
Yet, when an allele was stuck in the first position, it had a hard time finding a way to go to the second allele.

It would probably make much more sense to just check if only two alleles exist, and switch to the other one if p > 0. Since this is a sensible part of the code, I will see if I have time to change it and do some intensive testing, or if I leave it like this. Results are anyways correct, it is just much much slower (a factor of about 100 in the case of Elisa).

@sneuensc
Copy link
Collaborator

sneuensc commented Feb 12, 2018 via email

@frederic-michaud frederic-michaud changed the title Finding allele to mutate too in an efficient way Finding allele to mutate to in an efficient way Feb 12, 2018
@frederic-michaud
Copy link
Collaborator Author

Actually, I tried this on Friday, and it shows no improvement in the rapidity of the execution. I just added an if-else statement in the mutation function, making the code more cumbersome. I didn't commit it because adding complexity to the code without significative time-saving is not really the direction I want to take.

A cleaner way would probably be to write two different functions for the mutation and to adapt the pointer _mut_model_func_ptr depending on the value of _nb_allele.

In any case, since 4e87c16, mutation is not anymore a major bottleneck in Elisa's simulation, only about 30% time in spend doing it, and first step to improve it is either to do it at the genome level or at the population level, which would probably make it much faster but would need major refactoring.

@frederic-michaud
Copy link
Collaborator Author

After rechecking my result from Friday, I see that I was wrong. Actually, the new version is about twice faster than the old one. Yet, another bottleneck is present just before, which is to test whether mutation occurs of not. More precisely, the entire mutation step is about 30% of the time, but looking for a new allele is only about 0.4% of the time and 0.2% of the time with the new version.

If we include the new version, we would gain about 0.2% speed. I don't think it's worthwhile the cost. I let the issue open to have it in mind if I have to refactor this one day.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants