create new workflow to use Dask task execution #3

Open
tiborsimko opened this issue Oct 9, 2024 · 0 comments

@tiborsimko
Member

Currently, this example uses Snakemake to parallelise jobs in a multi-cascading style (i.e. processing files independently for each sample, then merging the files of each sample, then merging the samples together).

It would be good to test the same demo example running on Dask (USE_DASK = True in the notebook).

This could be done in two ways: (i) in addition to Snakemake, or (ii) instead of Snakemake.

For the former, the inputs.yaml file already contains a pre-prepared use_dask input parameter. However, combining Snakemake and Dask parallelisation does not seem advantageous: the multi-cascading structure of the Snakefile would probably have to change considerably so that we do not scatter "twice" via Snakemake, but rather let Dask do some of the scattering. We can get to this in the future.

For now, let's try the latter, i.e. use Dask only for all the DAG job multi-cascading. We can do this by creating a new reana-dask.yaml workflow specification file that could even use the Serial workflow engine and call the notebook with USE_DASK set to True, letting Dask do all the parallelisation to arrive at the results.
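To make the idea concrete, here is a minimal sketch of how the three-level cascade (per file, per sample, all samples) could be expressed as a single Dask task graph. It is not taken from the notebook: `process_file`, `merge_sample`, `merge_all`, and `run_cascade` are hypothetical stand-ins, and the `use_dask` argument merely mimics the notebook's USE_DASK switch.

```python
from typing import Dict, List


def process_file(path: str) -> int:
    """Hypothetical per-file step; returns a dummy partial result."""
    return len(path)


def merge_sample(partials: List[int]) -> int:
    """Hypothetical merge of all partial results belonging to one sample."""
    return sum(partials)


def merge_all(per_sample: List[int]) -> int:
    """Hypothetical final merge across samples."""
    return sum(per_sample)


def run_cascade(samples: Dict[str, List[str]], use_dask: bool = True) -> int:
    """Run the file -> sample -> total cascade, optionally through Dask."""
    if use_dask:
        # Imported lazily so the serial path works without Dask installed.
        from dask import delayed

        # Build the whole DAG lazily; nothing runs until compute().
        per_sample = [
            delayed(merge_sample)([delayed(process_file)(f) for f in files])
            for files in samples.values()
        ]
        return delayed(merge_all)(per_sample).compute()

    # Serial fallback: same cascade, no parallelism.
    return merge_all(
        [merge_sample([process_file(f) for f in files]) for files in samples.values()]
    )
```

Calling `run_cascade(samples, use_dask=True)` hands the whole graph to Dask, so the per-file tasks can run in parallel and each merge starts once its inputs are ready. The workflow engine then only needs to launch one step, which is why the Serial engine could suffice.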

We could then start comparing Snakemake-based parallelisation with Dask-based parallelisation and see how well each performs.

@tiborsimko tiborsimko added this to Dask Oct 9, 2024
@tiborsimko tiborsimko moved this to Ready for work in Dask Oct 9, 2024