Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IsUserAwakeBeforeShutdown called shifter and fallback, right away. #250

Open
dneise opened this issue Oct 13, 2017 · 5 comments
Open

IsUserAwakeBeforeShutdown called shifter and fallback, right away. #250

dneise opened this issue Oct 13, 2017 · 5 comments
Assignees

Comments

@dneise
Copy link
Member

dneise commented Oct 13, 2017

This morning, SH called 20min before shutdown, because nobody was awake. so far so good.
But it called the fallback immediately and not after 15 minutes.
I do not know why.

@dneise
Copy link
Member Author

dneise commented Oct 13, 2017

from the logs (removed private information:

2017-10-13 05:50:09,844 - shifthelper.conditions - DEBUG - is_nobody_awake -> True
2017-10-13 05:50:09,907 - shifthelper.checks - DEBUG - Giving message status WARNING
2017-10-13 05:50:09,969 - shifthelper.notifiers - DEBUG - Got a message
2017-10-13 05:50:09,969 - shifthelper.notifiers - DEBUG - Message is over alert level
2017-10-13 05:50:09,969 - shifthelper.notifiers - DEBUG - Getting phone number of primary shifter
2017-10-13 05:50:09,990 - shifthelper.tools.shift - DEBUG - Found shifter <shifter>
2017-10-13 05:50:09,991 - shifthelper.notifiers - DEBUG - Getting phone number of fallback shifter
2017-10-13 05:50:09,991 - shifthelper.notifiers - INFO - Calling +*******83
2017-10-13 05:50:09,991 - custos.notify.twilio - DEBUG - Received message
2017-10-13 05:50:09,992 - custos.notify.twilio - DEBUG - Placing call
2017-10-13 05:50:10,902 - shifthelper.notifiers - INFO - Calling +********34
2017-10-13 05:50:10,903 - custos.notify.twilio - DEBUG - Received message
2017-10-13 05:50:10,903 - custos.notify.twilio - DEBUG - Placing call
2017-10-13 05:50:11,568 - shifthelper.tools.shift - DEBUG - Found shifter <shifter>
2017-10-13 05:50:11,947 - custos.notify.log - WARNING - Check: IsUserAwakeBeforeShutdown - Message: There is a shift at the moment and 
20min before shutdown and 
Parker not Awake

@dneise
Copy link
Member Author

dneise commented Oct 13, 2017

This is the part of code, responsible for calling the fallback:

if self._get_oldest_call_age() >= self.time_before_fallback:
    log.debug('Getting phone number of fallback shifter')
    numbers_to_call.append(self.phone_number_of_fallback_shifter())

This self._get_oldest_call_age() looks at a list of calls and just finds the oldest call. Calls are removed from this list, if they are too old and if they were already acknowledged.


I found a bug in the function removing old and acknowledged calls:

    def _remove_acknowledged_and_old_calls(self):
        """ from the list of not acknowledged calls
        remove all calls, which have been acknowledged on the web page

        Also remove calls older than 2 hours, to get out of
        a "call the backup shifter" dead lock
        """
        alerts = {a['uuid']: a for a in get_alerts()}
        if not alerts:
            return

        for msg in copy(self.not_acknowledged_messages):
            age = datetime.datetime.utcnow() - msg.timestamp
            if age > (self.max_time_for_fallback + self.time_before_fallback):
                self.not_acknowledged_messages.remove(msg)
            else:
                try:
                    alert = alerts[str(msg.uuid)]
                except KeyError:
                    continue

                if alert['acknowledged'] is True:
                    self.not_acknowledged_messages.remove(msg)

It says: When you cannot get any alerts from the website ... return. So it does not even try to remove too-old alerts, when it cannot find out what was acknoledged. This function is doing to independent things... and if the one thing is impossible, it does not even try the other idenpendent thing. That is of course a bug.

I think, I might have recently introduced this bug .. let me see.

@dneise
Copy link
Member Author

dneise commented Oct 13, 2017

Ah the reason, why it showed up today was:

Normally there is at least one alert in the list of alerts ... the dummy-alert from the start-up checklist. But today this dummy alert was not in the list. Either the shifter did not try it. Or it did not work. Either way ... the shifter should at least have it reported in the log-book. If not having called one of the SH expert in the evening, because if the dummy alert is not working, this is serious. ...

Well ... this error (not doing the dummy-alert) ... made us realize this bug. ... so it was a good thing in the end.

@dneise
Copy link
Member Author

dneise commented Oct 13, 2017

The fix is easy:

instead of

        alerts = {a['uuid']: a for a in get_alerts()}
        if not alerts:
            return

we just need:

        alerts = {a['uuid']: a for a in get_alerts()}

alerts will be an empty dict now and the code below is fine with an empty set of alerts...

@dneise
Copy link
Member Author

dneise commented Oct 13, 2017

We are running the shifthelper_heartbeat branch at the moment and indeed I edited this part ... but the same behaviour (returning, instead of carrying on with an empty set of alerts) was already there before my edit .. so it wasnt me :-D

But I should have seen that (embarrassed)

@dneise dneise self-assigned this Oct 13, 2017
dneise pushed a commit that referenced this issue Oct 13, 2017
I just removed this `if not alerts: return` part. but I though this
function doing two things ... was part of the problem so I took it apart.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant