Reboots done via `juju ssh` are causing hook errors intermittently
#921
Comments
There's a

```bash
#!/bin/bash
sleep 15
shutdown -r now
```

whereas the zaza code runs this:

```python
def reboot(unit_name):
    """Reboot unit.

    :param unit_name: Unit Name
    :type unit_name: str
    :returns: None
    :rtype: None
    """
    # NOTE: When used with series upgrade the agent will be down.
    # Even juju run will not work
    cmd = ['juju', 'ssh', unit_name, 'sudo', 'reboot', '&&', 'exit']
```

Both
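As a small illustration of the argument list quoted above: when such a list is handed to `subprocess` without a shell, every element is passed to `juju` as a literal argument, so `&&` is not interpreted by the local shell. The sketch below only renders the list as a command line; the helper name `build_reboot_cmd` and the unit name `ovn-chassis/0` are placeholders invented for this example, not part of zaza.

```python
import shlex


def build_reboot_cmd(unit_name):
    """Return the same argument list that the quoted reboot() builds.

    Hypothetical helper, used here only to inspect the arguments.
    """
    return ['juju', 'ssh', unit_name, 'sudo', 'reboot', '&&', 'exit']


# 'ovn-chassis/0' is a placeholder unit name.
cmd = build_reboot_cmd('ovn-chassis/0')

# shlex.join() quotes anything a shell would treat specially, which makes
# it visible that '&&' is just another argument in this list.
rendered = shlex.join(cmd)
print(rendered)  # prints: juju ssh ovn-chassis/0 sudo reboot '&&' exit
```

What happens to `&&` on the remote side depends on how `juju ssh` forwards its arguments, which this sketch does not attempt to model.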
What's interesting is that using
Neither can I use
However, I can do:
I filed a bug so that this can be addressed if the Juju team thinks it's a good idea: https://bugs.launchpad.net/juju/+bug/1990140

Meanwhile, I think we can use this:
That is so convoluted! I.e. "unit, please run this when you can". One thing we will need to watch for in tests is that this is really async, and so the time between issuing this command and the start of the reboot is

```python
logging.info("dist-upgrade required reboot machine: %s", machine)
await reboot(machine)
logging.info("Waiting for machine to come back afer reboot: %s",
             machine)
await model.async_block_until_file_missing_on_machine(
    machine, "/var/run/reboot-required")
logging.info("Waiting for machine idleness on %s", machine)
await asyncio.sleep(5.0)
await model.async_block_until_units_on_machine_are_idle(machine)
```

does the right thing, so maybe it should be integrated into the
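Because the reboot is asynchronous, a test has to wait on an observable condition rather than assume the machine is already back. The following is a minimal, generic polling sketch of that idea. It is not zaza's implementation of the `async_block_until_*` helpers; the names `block_until` and `reboot_flag_cleared` are hypothetical, and the condition is a stand-in counter rather than a real check on the machine.

```python
import asyncio
import time


async def block_until(condition, timeout=60.0, interval=0.5):
    """Poll an async condition until it is true or the timeout expires.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        if await condition():
            return
        if time.monotonic() >= deadline:
            raise asyncio.TimeoutError("condition not met within timeout")
        await asyncio.sleep(interval)


async def _demo():
    checks = 0

    async def reboot_flag_cleared():
        # Stand-in for a real check such as "the reboot-required file is
        # gone"; here it simply becomes true on the third poll.
        nonlocal checks
        checks += 1
        return checks >= 3

    await block_until(reboot_flag_cleared, timeout=5.0, interval=0.01)
    return checks


demo_checks = asyncio.run(_demo())
```

In a real test the condition would query the machine, and the timeout would need to cover both the delay before the reboot starts and the reboot itself.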
Issue #921 has the context for the change, which is made in order to avoid triggering reboots asynchronously to Juju hook executions (as happens when `juju ssh reboot` is done). openstack-charmers/zaza-openstack-tests#921 (cherry picked from commit 3b74604)
https://bugs.launchpad.net/juju/+bug/1989629 - the issue description itself
https://bugs.launchpad.net/juju/+bug/1989629/comments/4 - root cause analysis

- [...] (`enable_dpdk`) and reboots a machine via `juju ssh` to apply some changes;
- [...] `update-status` before the agent is stopped by systemd;
- [...] `update-status` execution.

This was bugging us in https://review.opendev.org/c/x/charm-ovn-chassis/+/856548/ due to its intermittent nature.
The codepath that leads to this:
zaza-openstack-tests/zaza/openstack/charm_tests/ovn/tests.py (line 305 in a65ef82)
zaza-openstack-tests/zaza/openstack/charm_tests/test_utils.py (line 693 in a65ef82)
https://github.com/openstack-charmers/zaza/blob/8f9f9c79b246ef09a632d40323c975f002fcd4cf/zaza/utilities/machine_os.py#L231
https://github.com/openstack-charmers/zaza/blob/8f9f9c79b246ef09a632d40323c975f002fcd4cf/zaza/utilities/machine_os.py#L203
https://github.com/openstack-charmers/zaza/blob/8f9f9c79b246ef09a632d40323c975f002fcd4cf/zaza/utilities/generic.py#L493-L504