Reduce computation time with a very small number of subsamples #10
Comments
Hi Remi,
…
Best,
Thank you very much for your quick answer. It would be great if you could have an updated version in a few days! Thank you again. Best
Hi again! Just to follow up on this: without taking the uncertainty into account, it took 12.5 h to run. Regarding your previous message, for filtering based on fixed p-values, you meant fix.pv < 0.01, right? Best
Hi, I also added a new parameter, … Back to your question: yes, my suggestion is to use fix.pv < 0.05 to select some genes and then fit para.pv to get more reliable p-values, if the computational time is still a problem. Another thing: for genes with too many zeros (e.g., > 90%), the model will converge poorly, so it would be better to filter out genes with almost all zeros. Thanks! Best,
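(For readers skimming this thread, here is a minimal sketch of the two-stage strategy described above, in R. It assumes the `runPseudotimeDE()` arguments shown in the package README (`gene.vec`, `ori.tbl`, `sub.tbl`, `mat`, `model`, `mc.cores`), that passing `sub.tbl = NULL` skips the subsampling step and returns only `fix.pv`, and that the result tibble carries `gene` and `fix.pv` columns; the input object names are placeholders. Check all of this against your installed version.)

```r
library(PseudotimeDE)

## Placeholder inputs:
##   expr_mat : genes x cells count matrix (check the orientation your
##              installed version expects)
##   ori_tbl  : tibble with columns `cell` and `pseudotime`
##   sub_tbl  : list of tibbles, one per pseudotime subsample

## 1) Drop genes that are almost all zeros (here > 90% zeros),
##    since the model converges poorly on them.
zero_frac  <- rowMeans(expr_mat == 0)
keep_genes <- rownames(expr_mat)[zero_frac <= 0.9]

## 2) Stage one: cheap screen on the fixed pseudotime only.
##    sub.tbl = NULL is assumed to skip the subsampling step,
##    so only fix.pv is computed.
stage1 <- runPseudotimeDE(
  gene.vec = keep_genes,
  ori.tbl  = ori_tbl,
  sub.tbl  = NULL,
  mat      = expr_mat,
  model    = "nb",     # NB-GAM; pick the model that suits your data
  mc.cores = 18
)

## 3) Stage two: rerun only the genes passing the loose screen,
##    this time with the subsamples, to get para.pv.
##    Assumes the result tibble has `gene` and `fix.pv` columns.
hits <- stage1$gene[stage1$fix.pv < 0.05]

stage2 <- runPseudotimeDE(
  gene.vec = hits,
  ori.tbl  = ori_tbl,
  sub.tbl  = sub_tbl,  # full list of subsampled pseudotimes
  mat      = expr_mat,
  model    = "nb",
  mc.cores = 18
)
```

The 0.05 screen is deliberately loose: stage two, the expensive part, then only runs on the small fraction of genes that survive it.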
Sorry to unearth this a year later, but it feels pertinent to my query. Namely, I've got a dataset even larger than the one raised in this issue, with 20,000+ cells and 20,000+ genes. I think this is quite representative of scenarios where people would like to use the tool, as large datasets are common these days; in fact, I can see wanting to use this on even more cells! Based on the discussion here, along with information in the tutorial, I'd go with log-transformed counts and …
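(As a rough illustration of the setup this comment describes, not an answer from the maintainers: a hypothetical call on log-transformed data. It assumes the package exposes a Gaussian model choice for log-scale input via `model = "gaussian"` — verify the available `model` values in your version — and `lognorm_mat` is a placeholder for a genes × cells log-normalized matrix.)

```r
## Hypothetical large-data run on log-transformed counts.
## `lognorm_mat`, `ori_tbl`, and `sub_tbl` are placeholder objects.
res_large <- runPseudotimeDE(
  gene.vec = rownames(lognorm_mat),
  ori.tbl  = ori_tbl,
  sub.tbl  = sub_tbl,      # or NULL to ignore pseudotime uncertainty
  mat      = lognorm_mat,
  model    = "gaussian",   # assumed model choice for log-scale data
  mc.cores = parallel::detectCores() - 1
)
```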
Hi Krzysztof,
…
Best regards,
Hi!
Thank you for this great package.
Even using only 100 subsamples, running runPseudotimeDE on about 2,000 cells and 5,000 genes with 18 cores takes a very long time (more than 30 h at the moment, and it is still running), and I have many more trajectories to test.
Therefore, I was wondering: in your opinion, to reduce the computational time, is it better to ignore the pseudotime uncertainty and just take the fix.pv? Or is it still better to take the uncertainty into account with a very small number of subsamples (10, 5, or even 2)?
Thank you very much in advance.
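(A concrete sketch of the two options being weighed here, under the same README-based assumptions as the sketches earlier in the thread: `sub.tbl = NULL` for fixed-pseudotime p-values only, versus truncating the subsample list to a handful of entries. Object names are placeholders.)

```r
library(PseudotimeDE)

## Option A: ignore pseudotime uncertainty; only fix.pv is computed.
res_fix <- runPseudotimeDE(
  gene.vec = gene_names,   # placeholder: character vector of genes
  ori.tbl  = ori_tbl,      # placeholder: tibble with `cell`, `pseudotime`
  sub.tbl  = NULL,         # assumed to skip the subsampling step
  mat      = expr_mat,
  model    = "nb",
  mc.cores = 18
)

## Option B: keep the uncertainty but use only a few subsamples.
## Runtime should scale roughly linearly with length(sub.tbl), so
## 10 subsamples instead of 100 is about a 10x saving, at the cost
## of estimating the p-value distribution from only 10 statistics.
res_sub10 <- runPseudotimeDE(
  gene.vec = gene_names,
  ori.tbl  = ori_tbl,
  sub.tbl  = sub_tbl[1:10],  # first 10 of the full subsample list
  mat      = expr_mat,
  model    = "nb",
  mc.cores = 18
)
```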