Problem with mini-batches in Controller and fixed nb of mb in Worker before sync. #24

Open
mgermain opened this issue Jan 13, 2016 · 4 comments

@mgermain
Contributor

In the case where the Controller manages the mini-batches but the Worker decides when to sync with the global parameters, the Worker can end up waiting for more mini-batches before doing a sync when none are available.

A possible fix for this would be to let the Controller decide when a Worker should sync.
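
To make the proposal concrete, here is a minimal sketch of a controller that owns the sync decision. It is plain Python with no sockets, and every name in it (`ToyController`, `next_action`, the `"train"`/`"sync"`/`"done"` replies) is hypothetical and not part of this project's API. The one design assumption is that the controller also triggers a final sync for a partial period once the data runs out.

```python
from collections import deque


class ToyController:
    """Hypothetical controller: hands out mini-batches and decides when to sync."""

    def __init__(self, minibatches, sync_every):
        self.queue = deque(minibatches)
        self.sync_every = sync_every
        self.since_sync = {}  # worker id -> mini-batches processed since last sync

    def next_action(self, worker_id):
        count = self.since_sync.get(worker_id, 0)
        if count >= self.sync_every or (not self.queue and count > 0):
            # Sync after a full period, or flush a partial one when data runs out.
            self.since_sync[worker_id] = 0
            return ("sync", None)
        if not self.queue:
            return ("done", None)
        self.since_sync[worker_id] = count + 1
        return ("train", self.queue.popleft())


def run(n_batches=22, n_workers=2, sync_every=10):
    """Poll the controller round-robin and record the actions each worker receives."""
    controller = ToyController(range(n_batches), sync_every)
    log = {w: [] for w in range(n_workers)}
    active = set(log)
    while active:
        for w in sorted(active):
            action, _ = controller.next_action(w)
            log[w].append(action)
            if action == "done":
                active.discard(w)
    return log
```

Because the worker never counts mini-batches itself, it cannot block waiting for a batch that will never arrive; it only ever acts on the controller's reply.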

@mgermain mgermain added the bug label Jan 13, 2016
@abergeron
Contributor

This seems like a convoluted and constructed scenario.

Having the minibatch dispatch and the controller in the same process should probably not even be supported, since it is very slow anyway: the two tasks keep blocking each other due to the ZeroMQ design.

@mgermain
Contributor Author

mgermain commented Feb 4, 2016

I'm talking about a simple, basic use case, with separate processes and all.

Let's say you have 22 mini-batches in total and 2 workers that each sync every 10 mini-batches.
They each ask for the first 10 mini-batches and sync, then ask for 1 or 2 more and hang waiting for the remaining 8 or 9 before syncing. After the socket timeout, the worker simply crashes.
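
The arithmetic can be checked with a small simulation. This is only a sketch: it assumes the mini-batches are handed out round-robin, and the function name `simulate` is made up for illustration. Under that assumption each worker ends up 1 mini-batch into a new period and 9 short of its next sync; a worker that happened to receive both leftover mini-batches would be 8 short instead.

```python
def simulate(total_batches=22, n_workers=2, sync_every=10):
    """Hand out mini-batches round-robin; workers sync after every `sync_every`."""
    remaining = total_batches
    pending = [0] * n_workers  # mini-batches processed since the last sync
    syncs = [0] * n_workers    # completed syncs per worker
    while remaining > 0:
        for w in range(n_workers):
            if remaining == 0:
                break
            remaining -= 1
            pending[w] += 1
            if pending[w] == sync_every:
                syncs[w] += 1
                pending[w] = 0
    # A worker with a partial period is stuck: it still needs this many
    # mini-batches before it will sync, but the supply is exhausted.
    stuck = {w: sync_every - p for w, p in enumerate(pending) if p > 0}
    return syncs, stuck
```

Any total that is not a multiple of the sync period leaves at least one worker in `stuck`, which is the state that ends in the socket timeout.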

There are ways around this but I think this should be easier to use or better-documented somehow.

@abergeron
Contributor

The minibatch server should not have a limited supply. Or if it is limited it should be enough to fully satisfy each worker.

I don't think we should support any other use case.

@nouiz
Contributor

nouiz commented Feb 5, 2016

So maybe only allow the minibatch server to send 10 mini-batches at a time, so
the last 2 mini-batches won't be used?

We should at least document this limitation. I don't think a better fix is a
priority if the worker just crashes due to a timeout.
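
A rough sketch of the "only send full blocks" idea, assuming the server knows the sync period up front. The helper `full_blocks` is hypothetical and not an existing function in this project.

```python
def full_blocks(minibatches, sync_every):
    """Split mini-batches into complete blocks of `sync_every`; return the rest separately."""
    batches = list(minibatches)
    n_full = len(batches) // sync_every
    blocks = [
        batches[i * sync_every:(i + 1) * sync_every]
        for i in range(n_full)
    ]
    leftover = batches[n_full * sync_every:]  # never served to any worker
    return blocks, leftover
```

With 22 mini-batches and a period of 10, this yields 2 blocks and leaves mini-batches 20 and 21 unused, so a worker that receives a block can always complete its sync period.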



3 participants