Announcing same jobs every 10 seconds #3

Open
jaraco opened this issue Jun 30, 2015 · 5 comments

Comments


jaraco commented Jun 30, 2015

Originally reported by: Darwin Monroy (Bitbucket: dmonroy, GitHub: dmonroy)


When pipeline workers are down and there are jobs ready to be taken, mettle starts announcing those jobs in a loop, so the RabbitMQ job queue keeps growing.

Here is the chart of messages after 1 hour of pipeline workers being down (dev env)
Screen Shot 2015-06-30 at 11.24.05.png

And the mettle logs show this every 10 seconds:

#!logs

11:39:29 timer.1         | INFO:mettle.timer:Sleeping for 10 seconds
11:39:22 timer.1         | INFO:mettle.timer:Checking pipelines.
11:39:23 timer.1         | INFO:mettle.timer:Checking jobs.
...
11:39:23 timer.1         | INFO:mettle_protocol.messages:Announcing job .............
...
11:39:29 timer.1         | INFO:mettle.timer:Cleaning up old logs.
11:39:29 timer.1         | INFO:mettle.timer:Finished scheduled tasks.  Took 7.364271 seconds
11:39:29 timer.1         | INFO:mettle.timer:Sleeping for 10 seconds

26000 messages isn't a big number in general, but for a development environment with just one pipeline (50 jobs) it is huge. A few weeks ago the production MQ server became slow because of millions of messages in the queues; some mettle processes lost connectivity to the MQ server and then died.



jaraco commented Jun 30, 2015

Original comment by Darwin Monroy (Bitbucket: dmonroy, GitHub: dmonroy):


Here's the queue chart:

Screen Shot 2015-06-30 at 11.57.16.png


jaraco commented Jun 30, 2015

Original comment by Darwin Monroy (Bitbucket: dmonroy, GitHub: dmonroy):


If there are no pipeline workers available, mettle must not announce a job.
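
To illustrate what such a check could look like, here is a minimal sketch. It assumes a pika-style channel, where a passive `queue_declare` returns a frame whose `method.consumer_count` reports how many consumers are attached. The function name `queue_has_consumers` and the `queue_name` argument are hypothetical and not taken from mettle's code.

```python
def queue_has_consumers(channel, queue_name):
    """Return True if at least one consumer is attached to the queue.

    `channel` is assumed to behave like a pika channel. passive=True only
    inspects an existing queue; it never creates one. With a real broker,
    a passive declare on a missing queue raises an error instead.
    """
    frame = channel.queue_declare(queue=queue_name, passive=True)
    return frame.method.consumer_count > 0


def maybe_announce(channel, queue_name, announce):
    """Call `announce()` only when a worker is listening (hypothetical helper)."""
    if queue_has_consumers(channel, queue_name):
        announce()
        return True
    return False
```

The consumer count is only a snapshot: a worker can disconnect right after the check, so this reduces queue growth but does not rule it out.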


jaraco commented Jun 30, 2015

Original comment by Darwin Monroy (Bitbucket: dmonroy, GitHub: dmonroy):


2 hours later
Screen Shot 2015-06-30 at 14.02.09.png


jaraco commented Jul 23, 2015

Original comment by Brent Tubbs (Bitbucket: btubbs, GitHub: btubbs):


Rather than having the timer and dispatcher try to determine whether there's a worker available to receive a message, I think we should put a TTL on the messages that the timer sends. If the timer is going to re-announce an unclaimed job every 60 seconds, then we can give those announcement messages a 60 second TTL so they're automatically dropped when they're no longer needed.
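
A rough sketch of the TTL idea, not mettle's actual code: RabbitMQ reads a per-message TTL from the AMQP `expiration` property, given as a string of milliseconds. The helper names, the `make_properties` parameter, and the exchange and routing key arguments below are assumptions for illustration. With pika, `make_properties` would be `pika.BasicProperties`.

```python
def ttl_expiration(reannounce_seconds):
    """Return the AMQP 'expiration' value for one re-announce interval.

    RabbitMQ expects the per-message TTL in milliseconds, as a string.
    """
    if reannounce_seconds <= 0:
        raise ValueError("re-announce interval must be positive")
    return str(int(reannounce_seconds * 1000))


def publish_announcement(channel, make_properties, exchange, routing_key,
                         body, reannounce_seconds=60):
    """Publish a job announcement that expires after one re-announce interval.

    `channel` is assumed to expose a pika-style basic_publish();
    `make_properties` builds the message properties object
    (for example pika.BasicProperties).
    """
    props = make_properties(expiration=ttl_expiration(reannounce_seconds))
    channel.basic_publish(exchange=exchange, routing_key=routing_key,
                          body=body, properties=props)
    return props
```

Because every announcement carries the same TTL, expired messages reach the head of the queue in order, which is when RabbitMQ discards messages that have a per-message TTL.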


jaraco commented Jul 31, 2015

Original comment by Darwin Monroy (Bitbucket: dmonroy, GitHub: dmonroy):


That's a pretty good idea, I'll work on that.
