Integration of "future" package #150

Open
krlmlr opened this issue Feb 6, 2017 · 5 comments

krlmlr (Collaborator) commented Feb 6, 2017

It would be awesome if remake used BatchJobs as a backend to perform its computation. BatchJobs also has a "local" mode that performs the computation in the same R process, so this could well be the default mode of operation.

The future and future.BatchJobs packages could simplify the implementation.

CC @mllg @berndbischl @HenrikBengtsson.

richfitz (Owner) commented Feb 6, 2017

BatchJobs development has been moved to batchtools I believe.

I have some reservations about this:

  • using batchtools as the default would add both (presumably) some overhead and IMO an unacceptably long dependency chain
  • I have a set of packages with considerable overlap here (queuer, rrq, etc.) which share the same storr datastore; connecting these together has been a long-term goal (see "Parallelise tasks with parallel" #84). The batch* packages and future are both nice, though, in that they are among the few parallel-focussed packages that don't focus purely on doing a blocking parallel map.

The future package I really like the look of, though I've not used it in my own work yet (but then I'm not actually doing my own computational work anymore).

Given that all that is really needed in any case is a submit/fetch, perhaps a useful general approach would be some sort of generic interface that could support any of the queuing packages. That does end up implementing an interface-to-interfaces though, which seems a bit overcomplicated.
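
To make the submit/fetch idea concrete, here is a minimal base R sketch of what such a generic interface might look like. Everything in it is hypothetical: `local_backend`, `submit`, `is_done` and `fetch` are illustrative names only, not part of remake, queuer, rrq, batchtools or future. The backend shown simply runs each task in the current R process.

```r
# Hypothetical sketch: a "backend" is a list of three closures.
# None of these names come from remake or any queuing package.
local_backend <- function() {
  results <- list()  # task id -> list(value = ...)
  list(
    # Run the task immediately in the current R process and store its result.
    # Wrapping in list(value = ...) keeps NULL results from being dropped.
    submit = function(id, fun, args = list()) {
      results[[id]] <<- list(value = do.call(fun, args))
      invisible(id)
    },
    # A synchronous backend is done as soon as submit() has returned.
    is_done = function(id) !is.null(results[[id]]),
    # Return the stored result, or fail for an unknown task.
    fetch = function(id) {
      if (is.null(results[[id]])) stop("unknown or unfinished task: ", id)
      results[[id]]$value
    }
  )
}

backend <- local_backend()
backend$submit("sum", sum, list(1, 2, 3))
total <- backend$fetch("sum")
```

A queue-backed implementation would presumably keep the same three entry points and only change what happens behind them, which is where the interface-to-interfaces concern comes in.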

krlmlr (Collaborator, Author) commented Feb 6, 2017

Thanks.

To me it looks like we'd need to replace lapply(plan, ...) by a scheduler in remake_make1(). That scheduler would:

  • keep a list of "open" tasks
  • schedule only "ready" tasks
  • query completion of "running" tasks and update "ready" state, until done
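
The three steps above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not remake's code: `run_schedule`, the `tasks` structure with `deps` and `fun`, and the `submit`/`is_done`/`fetch` callbacks are all hypothetical, and the stand-in backend runs each task synchronously at submit time.

```r
# Hypothetical sketch of a dependency-aware scheduler loop.
# `tasks` is a named list; each task has `deps` (names of the tasks it needs)
# and `fun`, which receives the results of its dependencies as a named list.
run_schedule <- function(tasks, submit, is_done, fetch) {
  open <- names(tasks)       # not yet submitted
  running <- character(0)    # submitted, not yet collected
  results <- list()          # collected results, by task name
  order <- character(0)      # order in which tasks completed

  while (length(open) > 0 || length(running) > 0) {
    # A task is "ready" when all of its dependencies have results.
    ready <- Filter(function(id) all(tasks[[id]]$deps %in% names(results)), open)
    if (length(ready) == 0 && length(running) == 0) {
      stop("no task is ready or running: cyclic or missing dependency")
    }
    for (id in ready) {
      submit(id, tasks[[id]]$fun, list(results[tasks[[id]]$deps]))
    }
    open <- setdiff(open, ready)
    running <- c(running, ready)

    # Poll running tasks and collect those that have finished.
    finished <- Filter(is_done, running)
    for (id in finished) {
      results[id] <- list(fetch(id))  # list() wrapper keeps NULL results
      order <- c(order, id)
    }
    running <- setdiff(running, finished)
  }
  list(results = results, order = order)
}

# Trivial in-process stand-in for a real queue: runs each task at submit time.
store <- new.env()
submit  <- function(id, fun, args) assign(id, do.call(fun, args), envir = store)
is_done <- function(id) exists(id, envir = store, inherits = FALSE)
fetch   <- function(id) get(id, envir = store, inherits = FALSE)

tasks <- list(
  report = list(deps = c("clean", "model"),
                fun = function(d) paste(d$clean, d$model)),
  model  = list(deps = "clean", fun = function(d) toupper(d$clean)),
  clean  = list(deps = character(0), fun = function(d) "data")
)
out <- run_schedule(tasks, submit, is_done, fetch)
```

With an asynchronous backend the polling step would need a short `Sys.sleep()` or a blocking wait, otherwise the loop spins while tasks are still running.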

The future package has only two dependencies, globals and listenv; perhaps globals could even be made "suggested". I think future is already a nice interface-to-interfaces; I'd expect the overhead of the "eager local" operation mode of future to be sufficiently small if at all noticeable. What remains is the overhead of our scheduler.

An adapter between future and queuer/rrq would be a way of integrating your queues with future, and consequently with remake.

richfitz (Owner) commented Feb 6, 2017

The globals package could possibly be used internally for some of the code scouring (as mooted in #64, #121).

berndbischl commented

> BatchJobs development has been moved to batchtools I believe.

This is correct. If you want to go further here, please use batchtools.


HenrikBengtsson commented Feb 6, 2017

FYI, it's pretty high up on my todo list to get a first version of future.batchtools up and running. Most of the work will be to port the existing future.BatchJobs code over and adjust it according to the suggested migration docs. It will require me to find some free deep-focus time, but after that it should be quick, e.g. the API and all package / redundancy tests are basically already in place (= the same).

So, I'm obviously biased, but this is also why future exists in the first place: by utilizing the Future API you should be able to get what you need, and then the user can choose whatever backend they want.
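
As a rough illustration of that separation, the sketch below uses only `plan()`, `future()`, `resolved()` and `value()` from the future package, which must be installed. The task itself is made up, and `sequential` is assumed here to be the current name for the synchronous in-process mode discussed earlier in the thread.

```r
library(future)

# The user picks the backend once; the calling code below does not change.
# Assumption: `sequential` evaluates in the current R process.
# Other plans (e.g. multisession) could be swapped in here instead.
plan(sequential)

# Create a future for a (made-up) task, check on it, then collect its value.
f <- future({
  x <- 1:10
  sum(x)
})
done <- resolved(f)  # non-blocking check for completion
v <- value(f)        # blocks until the result is available
```

In a scheduler, `future()` would play the role of submit, `resolved()` of the completion query, and `value()` of fetch.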

UPDATE: The future.batchtools package is on CRAN since 2017-06-03.

krlmlr changed the title: BatchJobs integration → Integration of "future" package (Feb 24, 2017)