Feat: Check Remediation #89

Open
adambaumeister opened this issue Jul 19, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@adambaumeister
Collaborator

Note: this is a customer-requested feature.

User Story

When a test fails, we should have the option of remediating or attempting to remediate the issue using standard methodology. For example, when an IPSEC tunnel is down following an upgrade, the upgrade assurance process provides the command for bouncing the tunnel.

Current Functionality

When tests fail, users must remediate the failures manually, and most users follow the same steps to do so.

Possible Implementation

  • We could prescribe a set of remediation commands for each test that could be run if the user wants to attempt to remediate the issue. These could be provided as "fix" methods:
    CheckFirewall.fix_ipsec_tunnel_status()
  • We could allow the user to provide their own fix methods or even CLI commands to the check functions:
    def remediation_func():
        run_some_cli_command()  # placeholder for the user's own CLI command
    
    CheckFirewall.check_ipsec_tunnel_status(remediation_func)
  • We could do nothing and consider it out of scope for this project.
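
To make the second option more concrete, here is a minimal, self-contained sketch of a check that accepts an optional remediation callback and re-runs itself once after the fix. Everything in it is hypothetical: `CheckResult`, `run_check`, `bounce_tunnel` and the `tunnel` dict are stand-ins for illustration and are not part of the project's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    """Hypothetical result type, not the project's actual class."""
    passed: bool
    reason: str = ""
    remediated: bool = False


def run_check(
    check: Callable[[], CheckResult],
    remediation: Optional[Callable[[], None]] = None,
) -> CheckResult:
    """Run a check; on failure, optionally remediate and re-check once."""
    result = check()
    if result.passed or remediation is None:
        return result
    remediation()                      # user-supplied fix, e.g. bounce the tunnel
    retry = check()                    # verify whether the fix actually helped
    retry.remediated = retry.passed
    return retry


# Stand-in device state so the sketch runs without a firewall.
tunnel = {"up": False}


def check_ipsec_tunnel_status() -> CheckResult:
    if tunnel["up"]:
        return CheckResult(passed=True)
    return CheckResult(passed=False, reason="tunnel is down")


def bounce_tunnel() -> None:
    # Assumption: a real fix would send a CLI/API command to the device here.
    tunnel["up"] = True


if __name__ == "__main__":
    print(run_check(check_ipsec_tunnel_status))                 # no remediation
    print(run_check(check_ipsec_tunnel_status, bounce_tunnel))  # with remediation
```

Keeping remediation as an opt-in argument means the default behaviour (test and report only) stays unchanged, and the re-check tells the user whether the fix worked instead of assuming it did.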

Thoughts?

adambaumeister added the enhancement label Jul 19, 2023
@FoSix
Contributor

FoSix commented Jul 19, 2023

I would say that's normally not in the project's scope: we test things, we give results, and someone else takes care of fixing them if they feel it's fixable, either manually or in a CI process that runs the tests.

We could add these types of methods; as you said, they are quite simple and generic enough for 90% of cases. I assume we would target post-upgrade tests?
But which tests would get such methods? Of the current checks, the only one that would match is the one you mentioned, the IPSEC tunnel check. And the rest?

  • active_support - no license, no support, nothing that could be done here
  • expired_licenses - same here, nothing to fix
  • candidate_config - not a post-upgrade check
  • ntp_sync - this one is tricky: to trigger a sync, I think you have to restart the management plane. Or am I wrong?
  • panorama - ?? no idea
  • certificates_requirements - not much to fix, new cert required, pre-upgrade
  • content_version - this is a pre-upgrade test; we could add a remediation to update the content DB before the upgrade, but I think we talked about it and the conclusion was that this should be handled/scheduled by the customer
  • free_disk_space - pre-upgrade
  • ha - that could fail due to config/version differences, or network, not sure how to fix it
  • planes_clock_sync - a fix is usually a restart of the dataplane or the device, pre-upgrade, shouldn't happen after the upgrade
  • arp_entry_exist - if it's not there, it's not :)
  • session_exist - same here
