
Support minchunksize #114

Open

MasonProtter opened this issue Aug 12, 2024 · 3 comments

Comments

MasonProtter (Member) commented Aug 12, 2024:

I think a nice additional option would be a `minchunksize` that goes with `nchunks`. Basically, the way I'd want it to work is:

```julia
scheduler = StaticScheduler(nchunks=10, minchunksize=5)
tforeach(f, 1:3; scheduler)        # does not spawn tasks (serial fallback)
tforeach(f, 1:10; scheduler)       # spawns 2 tasks
tforeach(f, 1:20; scheduler)       # spawns 4 tasks
tforeach(f, 1:100; scheduler)      # spawns 10 tasks
tforeach(f, 1:10_000; scheduler)   # still spawns 10 tasks
```

I think this is really useful for the majority of naively parallel operations on well-behaved data when the mapping / reducing functions are cheap, because it lets you basically say "I want this parallelized, but don't create more tasks than necessary".
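The rule implied by the examples above can be sketched as a tiny helper (the function name `ntasks_with_minchunk` is hypothetical, not part of any actual API): spawn at most `nchunks` tasks, but never hand a task fewer than `minchunksize` elements, and fall back to serial execution when even one minimal chunk cannot be filled.

```julia
# Hypothetical sketch of the proposed semantics, not real OhMyThreads code:
# the number of spawned tasks is capped by `nchunks` and by how many
# chunks of at least `minchunksize` elements fit into the input.
function ntasks_with_minchunk(n, nchunks, minchunksize)
    return min(nchunks, n ÷ minchunksize)  # 0 means serial fallback
end

ntasks_with_minchunk(3, 10, 5)       # → 0 (serial fallback)
ntasks_with_minchunk(10, 10, 5)      # → 2
ntasks_with_minchunk(20, 10, 5)      # → 4
ntasks_with_minchunk(100, 10, 5)     # → 10
ntasks_with_minchunk(10_000, 10, 5)  # → 10
```

This reproduces the task counts listed in the example calls above.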

carstenbauer (Member) commented:

Sounds like a good idea to me.

carstenbauer (Member) commented:

Might make sense to support this upstream in ChunkSplitters.jl (cc @lmiq).

carstenbauer (Member) commented:

Should be easily doable now that we use ChunkSplitters v3.x, which has this as `minsize`.

carstenbauer added the "good first issue" label Oct 1, 2024