Feat: Check Remediation #89

adambaumeister · 2023-07-19T01:06:03Z

Note: this is a customer-requested feature.

User Story

When a test fails, we should have the option of remediating or attempting to remediate the issue using standard methodology. For example, when an IPSEC tunnel is down following an upgrade, the upgrade assurance process provides the command for bouncing the tunnel.

Current Functionality

Currently, when tests fail, users must remediate the failures manually, and most users follow the same steps to do so.

Possible Implementation

We could attach a set of remediation commands to each test, which could be run if the user wants to attempt to remediate the issue. These could be provided as "fix" methods:
CheckFirewall.fix_ipsec_tunnel_status()

We could allow the user to provide their own fix methods or even CLI commands to the check functions:

def remediation_func():
    # placeholder for whatever CLI command fixes the issue
    run_some_cli_command()

CheckFirewall.check_ipsec_tunnel_status(remediation_func)

We could do nothing and consider it out of scope for this project.
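To make the second option a bit more concrete, here is a minimal, self-contained sketch of a check that accepts an optional user-supplied fix function and re-runs itself once after the fix. Everything in it is hypothetical and for illustration only: `CheckResult`, `run_check_with_remediation`, `bounce_tunnel` and the `tunnel_state` dictionary are made-up names, not part of the current `CheckFirewall` API, and the "device" is just an in-memory flag.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    """Outcome of one check (hypothetical type, not the project's real one)."""
    passed: bool
    reason: str = ""
    remediation_attempted: bool = False


def run_check_with_remediation(
    check: Callable[[], CheckResult],
    remediation: Optional[Callable[[], None]] = None,
) -> CheckResult:
    """Run a check; if it fails and a fix was supplied, run the fix and re-check once."""
    result = check()
    if result.passed or remediation is None:
        # Nothing to fix, or the user did not opt in to remediation.
        return result
    remediation()
    retry = check()
    retry.remediation_attempted = True
    return retry


# Toy stand-in for a firewall: a single flag for the tunnel state.
tunnel_state = {"up": False}


def check_ipsec_tunnel_status() -> CheckResult:
    """Pass when the (simulated) tunnel is up."""
    if tunnel_state["up"]:
        return CheckResult(passed=True)
    return CheckResult(passed=False, reason="tunnel is down")


def bounce_tunnel() -> None:
    """Simulated fix; a real one would send the relevant command to the device."""
    tunnel_state["up"] = True


result = run_check_with_remediation(check_ipsec_tunnel_status, bounce_tunnel)
print(result)
```

Keeping the fix as a plain callable means the same wrapper would work for built-in "fix" methods and for user-provided ones, and leaving `remediation` unset keeps today's behaviour of only reporting the failure.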

Thoughts?


FoSix · 2023-07-19T13:20:47Z

I would say that normally this is not in the project's scope: we test things, we give results, and someone else takes care of fixing them if they feel it's fixable, either manually or in a CI process that runs the tests.

We could add these types of methods; as you said, they are quite simple and generic enough for 90% of cases. I would assume we are targeting post-upgrade tests?
But which tests would get such methods? Of the current checks, the only one that would match is the one you mentioned, the IPSEC tunnel check. And the rest?

active_support - no license, no support, nothing that could be done here
expired_licenses - same here, nothing to fix
candidate_config - not a post-upgrade check
ntp_sync - this one is tricky as to trigger a sync I think you have to restart the management plane? or am I wrong?
panorama - ?? no idea
certificates_requirements - not much to fix, new cert required, pre-upgrade
content_version - this is a pre-upgrade test; we could add a remediation to update the content DB before the upgrade, but I think we talked about it and concluded that this should be handled/scheduled by the customer
free_disk_space - pre upgrade
ha - that could fail due to config/version differences, or network, not sure how to fix it
planes_clock_sync - a fix is usually a restart of the dataplane or the device, pre-upgrade, shouldn't happen after the upgrade
arp_entry_exist - if it's not there, it's not :)
session_exist - same here

adambaumeister added the enhancement label on Jul 19, 2023
